GitHub

GitHub integration with Graffo

Configuration

The GitHub source connector connects to your GitHub repositories and uses the GitHub REST API to extract and synchronize their data.

It supports syncing repository metadata, directory structures, and code files, with configurable filtering options for branches and file types.

View Source Code

Explore the GitHub connector implementation at: https://github.com/graffo-ai/graffo/tree/main/backend/graffo/platform/sources/github.py

Authentication

This connector uses a custom authentication configuration.

Authentication Configuration

GitHub authentication credentials schema.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| personal_access_token | str | Yes | GitHub PAT with read rights (code, contents, metadata) to the repository |
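
As a rough illustration of how a token like this is presented to the GitHub REST API, the sketch below builds the request headers that GitHub documents for token authentication. The helper name `build_github_headers` is hypothetical and not part of Graffo; this page does not show how the connector itself sends the token.

```python
from typing import Dict


def build_github_headers(personal_access_token: str) -> Dict[str, str]:
    """Return HTTP headers for an authenticated GitHub REST API request."""
    token = personal_access_token.strip()
    if not token:
        raise ValueError("personal_access_token is required")
    return {
        # GitHub accepts personal access tokens as Bearer credentials.
        "Authorization": f"Bearer {token}",
        # Media type recommended by GitHub for REST API responses.
        "Accept": "application/vnd.github+json",
        # Pins the REST API version referenced elsewhere on this page.
        "X-GitHub-Api-Version": "2022-11-28",
    }
```

These headers can be passed to any HTTP client; a request to `https://api.github.com/repos/{owner}/{repo}` that returns a 200 response is a quick way to confirm the token can read the repository.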

Configuration Options

The following configuration options are available for this connector:

Configuration Parameters

GitHub configuration schema.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| repo_name | str | Yes | - | Repository to sync in owner/repo format (e.g., 'graffo-ai/graffo') |
| branch | str | No | "" | Specific branch to sync (e.g., 'main', 'development'). If empty, uses the default branch. |
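
The two options above can be checked before a sync is started. The following is a minimal sketch, assuming a hypothetical `GitHubConfigSketch` class that is not Graffo's actual schema: it validates the owner/repo format and falls back to the repository's default branch when `branch` is empty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GitHubConfigSketch:
    """Hypothetical stand-in for the connector configuration."""

    repo_name: str
    branch: str = ""

    def __post_init__(self) -> None:
        owner, sep, repo = self.repo_name.partition("/")
        # Exactly one slash, with a non-empty owner and repository name.
        if not sep or not owner or not repo or "/" in repo:
            raise ValueError("repo_name must be in 'owner/repo' format")

    @property
    def owner(self) -> str:
        return self.repo_name.split("/")[0]

    @property
    def repo(self) -> str:
        return self.repo_name.split("/")[1]

    def resolve_branch(self, default_branch: str) -> str:
        # An empty branch means "use the repository's default branch".
        return self.branch or default_branch
```

For example, `GitHubConfigSketch("graffo-ai/graffo").resolve_branch("main")` yields `"main"`, while setting `branch="development"` overrides the default.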

Data Models

The following data models are available for this connector:

GitHubRepositoryEntity

Schema for GitHub repository entity.

| Field | Type | Description |
| --- | --- | --- |
| repo_id | int | Unique GitHub repository ID |
| name | str | Repository short name |
| created_at | datetime | Timestamp when the repository was created |
| updated_at | datetime | Timestamp when the repository was last updated |
| full_name | str | Full repository name including owner |
| description | Optional[str] | Repository description |
| default_branch | str | Default branch of the repository |
| language | Optional[str] | Primary language of the repository |
| fork | bool | Whether the repository is a fork |
| size | int | Size of the repository in KB |
| stars_count | Optional[int] | Number of stars |
| watchers_count | Optional[int] | Number of watchers |
| forks_count | Optional[int] | Number of forks |
| open_issues_count | Optional[int] | Number of open issues |
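
To make the field list concrete, here is a sketch of how a repository payload from the GitHub REST API could be mapped onto these fields. `RepositoryRecord` and `repository_from_payload` are hypothetical names, and the mapping (for example `id` to `repo_id` and `stargazers_count` to `stars_count`) is an assumption based on the GitHub API's field names, not a copy of the connector's code.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class RepositoryRecord:
    """Hypothetical stand-in mirroring the GitHubRepositoryEntity fields."""

    repo_id: int
    name: str
    created_at: datetime
    updated_at: datetime
    full_name: str
    default_branch: str
    fork: bool
    size: int
    description: Optional[str] = None
    language: Optional[str] = None
    stars_count: Optional[int] = None
    watchers_count: Optional[int] = None
    forks_count: Optional[int] = None
    open_issues_count: Optional[int] = None


def parse_github_timestamp(value: str) -> datetime:
    # GitHub returns ISO 8601 timestamps ending in "Z"; normalize the suffix
    # so datetime.fromisoformat() accepts it on older Python versions too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def repository_from_payload(payload: Dict[str, Any]) -> RepositoryRecord:
    """Map a GitHub REST API repository object onto the record (assumed mapping)."""
    return RepositoryRecord(
        repo_id=payload["id"],
        name=payload["name"],
        created_at=parse_github_timestamp(payload["created_at"]),
        updated_at=parse_github_timestamp(payload["updated_at"]),
        full_name=payload["full_name"],
        default_branch=payload["default_branch"],
        fork=payload["fork"],
        size=payload["size"],
        description=payload.get("description"),
        language=payload.get("language"),
        stars_count=payload.get("stargazers_count"),
        watchers_count=payload.get("watchers_count"),
        forks_count=payload.get("forks_count"),
        open_issues_count=payload.get("open_issues_count"),
    )
```

Optional fields use `dict.get`, so a payload that omits them produces `None` rather than raising a `KeyError`.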

GitHubDirectoryEntity

Schema for GitHub directory entity.

| Field | Type | Description |
| --- | --- | --- |
| full_path | str | Repository-qualified directory path (owner/repo/path) |
| name | str | Directory name |
| path | str | Path of the directory within the repository |
| repo_name | str | Name of the repository containing this directory |
| repo_owner | str | Owner of the repository |
| branch | str | Branch that was traversed for this directory |
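
The relationship between `full_path`, `path`, `repo_owner`, and `repo_name` can be sketched as follows. The helper `qualified_path` is hypothetical, and the handling of the repository root (an empty `path`) is an assumption, since this page does not say how the connector represents it.

```python
def qualified_path(repo_owner: str, repo_name: str, path: str) -> str:
    """Build an owner/repo/path string from its parts."""
    # Drop empty segments so leading, trailing, or doubled slashes in the
    # path do not produce empty components in the result.
    segments = [segment for segment in path.split("/") if segment]
    return "/".join([repo_owner, repo_name, *segments])
```

With this sketch, a directory at `backend/graffo` in `graffo-ai/graffo` gets the qualified path `graffo-ai/graffo/backend/graffo`, and the repository root maps to `graffo-ai/graffo`.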

GitHubCodeFileEntity

Schema for GitHub code file entity.

| Field | Type | Description |
| --- | --- | --- |
| full_path | str | Repository-qualified file path (owner/repo/path) |
| name | str | Filename |
| branch | str | Branch that was traversed for this file |
| sha | str | SHA hash of the file content |
| line_count | Optional[int] | Number of lines in the file |
| is_binary | bool | Flag indicating if file is binary |
| html_url | Optional[str] | HTML URL for viewing this content on GitHub. |
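
This page does not describe how `is_binary` and `line_count` are computed. One common heuristic, shown here purely as an illustrative sketch with a hypothetical `classify_file` helper, treats a file as binary if a NUL byte appears near its start and counts lines only for text files.

```python
from typing import Optional, Tuple


def classify_file(data: bytes, sample_size: int = 8000) -> Tuple[bool, Optional[int]]:
    """Return (is_binary, line_count) for raw file bytes.

    Assumption: a NUL byte within the first `sample_size` bytes marks the
    file as binary, in which case no line count is reported.
    """
    if b"\x00" in data[:sample_size]:
        return True, None
    # Decode leniently so unusual encodings do not abort the count.
    text = data.decode("utf-8", errors="replace")
    return False, len(text.splitlines())
```

Returning `None` for the line count of a binary file matches the `Optional[int]` type of `line_count`, but whether the connector leaves it unset in that case is an assumption.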

GithubRepoEntity

Schema for a GitHub repository (alternative schema).

References: https://docs.github.com/en/rest/repos/repos?apiVersion=2022-11-28

Note: This is an alternative repository entity schema. Consider using GitHubRepositoryEntity.

| Field | Type | Description |
| --- | --- | --- |
| full_name | Optional[str] | Full name (including owner) of the repo. |
| owner_login | Optional[str] | Login/username of the repository owner. |
| private | bool | Whether the repository is private. |
| description | Optional[str] | Short description of the repository. |
| fork | bool | Whether this repository is a fork. |
| pushed_at | Optional[datetime] | When the repository was last pushed. |
| homepage | Optional[str] | Homepage URL for the repository. |
| size | Optional[int] | Size of the repository (in kilobytes). |
| stargazers_count | int | Number of stars on this repository. |
| watchers_count | int | Number of people watching this repository. |
| language | Optional[str] | Primary language of the repository. |
| forks_count | int | Number of forks for this repository. |
| open_issues_count | int | Number of open issues on this repository. |
| topics | List[str] | Topics/tags applied to this repo. |
| default_branch | Optional[str] | Default branch name of the repository. |
| archived | bool | Whether the repository is archived. |
| disabled | bool | Whether the repository is disabled in GitHub. |

GithubContentEntity

Schema for a GitHub repository's content (file, directory, submodule, etc.).

References: https://docs.github.com/en/rest/repos/contents?apiVersion=2022-11-28

Note: This is a generic content entity. Consider using specific entities like GitHubCodeFileEntity or GitHubDirectoryEntity.

| Field | Type | Description |
| --- | --- | --- |
| repo_full_name | Optional[str] | Full name of the parent repository. |
| path | Optional[str] | Path of the file or directory within the repo. |
| sha | Optional[str] | SHA identifier for this content item. |
| item_type | Optional[str] | Type of content. Typically 'file', 'dir', 'submodule', or 'symlink'. |
| size | Optional[int] | Size of the content (in bytes). |
| html_url | Optional[str] | HTML URL for viewing this content on GitHub. |
| download_url | Optional[str] | Direct download URL if applicable. |
| content | Optional[str] | File content (base64-encoded) if retrieved via 'mediaType=raw' or similar. |
| encoding | Optional[str] | Indicates the encoding of the content (e.g., 'base64'). |
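
Because `content` may arrive base64-encoded, a consumer typically checks `encoding` before using it. The sketch below shows one way to do that with Python's standard library; `decode_content` is a hypothetical helper, and treating any other encoding as plain UTF-8 text is an assumption.

```python
import base64
from typing import Optional


def decode_content(content: Optional[str], encoding: Optional[str]) -> Optional[bytes]:
    """Return the raw bytes of a content item, or None if no content is present."""
    if content is None:
        return None
    if encoding == "base64":
        # GitHub wraps base64 payloads across lines; b64decode ignores the
        # embedded newlines by default.
        return base64.b64decode(content)
    # Assumption: anything else is already plain text.
    return content.encode("utf-8")
```

The result is returned as bytes rather than a string so that binary files survive the round trip unchanged.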

GitHubFileDeletionEntity

Schema for GitHub file deletion entity.

This entity is used to signal that a file has been removed from the repository and should be deleted from the destination.

| Field | Type | Description |
| --- | --- | --- |
| full_path | str | Repository-qualified file path (owner/repo/path) |
| deletion_label | str | Human-readable deletion label |
| file_path | str | Path of the deleted file within the repository |
| repo_name | str | Name of the repository containing the deleted file |
| repo_owner | str | Owner of the repository |
| branch | Optional[str] | Branch context for the deleted file |
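
How the connector detects removed files is not covered on this page. A simple way to picture it, offered only as a sketch, is to compare the file paths seen in the previous sync with those seen in the current one and emit a deletion record for each path that disappeared. The function name and the `deletion_label` format below are assumptions.

```python
from typing import Dict, List, Optional, Set


def build_deletion_records(
    previous_paths: Set[str],
    current_paths: Set[str],
    repo_owner: str,
    repo_name: str,
    branch: Optional[str] = None,
) -> List[Dict[str, Optional[str]]]:
    """Create one deletion record per path present before but missing now."""
    records: List[Dict[str, Optional[str]]] = []
    # Sort so the output order is deterministic.
    for file_path in sorted(previous_paths - current_paths):
        records.append(
            {
                "full_path": f"{repo_owner}/{repo_name}/{file_path}",
                # The label wording is illustrative, not the connector's format.
                "deletion_label": f"Deleted file: {file_path}",
                "file_path": file_path,
                "repo_name": repo_name,
                "repo_owner": repo_owner,
                "branch": branch,
            }
        )
    return records
```

Each record carries the same keys as the fields listed above, so a destination could use `full_path` to locate and remove the matching file.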

Setting up a GitHub Personal Access Token for Graffo

To connect your GitHub repositories to Graffo, you'll need to create a Personal Access Token (PAT) with the appropriate permissions. This guide walks you through the process of creating and configuring a fine-grained token for use with Graffo.

Step 1: Access Developer Settings in GitHub

Open your GitHub account settings by clicking your profile picture in the top-right corner and selecting "Settings". Then scroll down the left sidebar and click "Developer settings".

Figure: Finding Developer Settings in GitHub

Step 2: Create a New Fine-Grained Token

In the Developer settings page, select "Fine-grained tokens" from the left menu, then click on "Generate new token".

Figure: Fine-grained tokens section

Step 3: Configure Your Token

Fill out the token form with the following details:

  1. Token name: Choose a descriptive name like "Graffo Integration"

  2. Expiration: Select an appropriate expiration date (recommended: 1 year for production use)

  3. Repository access: Choose either "All repositories" or select specific repositories you want to connect to Graffo

Figure: Creating a new token

Step 4: Set Required Permissions

For the GitHub connector to work properly, grant the following permission under "Repository permissions":

  • Set "Contents" to "Read-only" - This allows Graffo to read repository files

Figure: Setting content permissions

Step 5: Generate and Save Your Token

After configuring the permissions, scroll to the bottom of the page and click "Generate token".

Important: GitHub will display your token only once. Make sure to copy and store it in a secure location, as you won't be able to view it again.
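
If you use the token from your own scripts, one way to avoid hardcoding it is to read it from an environment variable. The sketch below assumes a hypothetical variable name, `GITHUB_PAT`; Graffo does not require this name, and any secret store works equally well.

```python
import os


def load_token(env_var: str = "GITHUB_PAT") -> str:
    """Read the personal access token from the environment."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return token
```

Keeping the token out of source files also keeps it out of version control, where it would be hard to revoke cleanly.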

Step 6: Add Your Token to Graffo

When setting up the GitHub connector in Graffo:

  1. Authentication: Paste your personal access token in the "Personal Access Token" field

  2. Configuration: Enter the repository name in the format owner/repo (e.g., graffo-ai/graffo) in the "Repository Name" configuration field

Your GitHub repository is now connected to Graffo and ready for synchronization.
