GitHub
API Documentation: GitHub API Documentation
High-Level Information: The GitHub connector enables extraction of key repository, user, and workflow data from GitHub’s REST API. It provides access to essential entities such as users, teams, repositories, commits, issues, pull requests, deployments, and workflows, along with detailed metadata like commit files, issue comments, labels, and milestones.
Source Setup Guide
- In your GitHub account, navigate to Settings → Developer settings → Personal access tokens → Tokens (classic).
- Click Generate new token and choose the classic token option.
- When selecting scopes, make sure the token has read access to every resource you want the connector to extract.
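Before pasting the token into the connector, it can help to confirm which scopes it actually carries. The sketch below is an illustration, not part of the connector: it calls GitHub's `GET /user` endpoint and reads the `X-OAuth-Scopes` response header that GitHub returns for classic tokens. The helper names `parse_scopes` and `fetch_token_scopes` are hypothetical.

```python
import urllib.request

API_ROOT = "https://api.github.com"


def parse_scopes(header_value):
    """Split an X-OAuth-Scopes header value into a set of scope names."""
    if not header_value:
        return set()
    return {scope.strip() for scope in header_value.split(",") if scope.strip()}


def fetch_token_scopes(token):
    """Call GET /user with the token and return the scopes GitHub reports.

    Hypothetical helper: assumes a classic personal access token, for which
    GitHub includes the X-OAuth-Scopes header in the response.
    """
    request = urllib.request.Request(
        f"{API_ROOT}/user",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_scopes(response.headers.get("X-OAuth-Scopes"))
```

For example, `fetch_token_scopes(os.environ["GITHUB_TOKEN"])` returns the set of scopes for a token stored in an environment variable (the variable name is an assumption). An invalid token raises `urllib.error.HTTPError` with status 401.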
GitHub enforces a rate limit on the number of API requests a user can make per hour (5,000 for standard authenticated requests). The limit applies per user, so several tokens generated from the same account share one quota. To avoid hitting the limit, consider generating personal access tokens from multiple GitHub accounts. Assign each token to a different connector and have each connector retrieve different data streams, so that requests are spread evenly across the tokens.
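To see how much quota a token has left, and to plan how streams might be split across tokens, something like the following sketch can be used. It reads GitHub's `GET /rate_limit` endpoint, which does not count against the quota. The function names, the token labels, and the round-robin split are illustrative assumptions, not the connector's own behavior.

```python
import json
import urllib.request
from datetime import datetime, timezone

API_ROOT = "https://api.github.com"


def summarize_core_limit(payload):
    """Extract the core REST quota from a GET /rate_limit response body."""
    core = payload["resources"]["core"]
    return {
        "limit": core["limit"],
        "remaining": core["remaining"],
        # GitHub reports the reset time as UTC epoch seconds.
        "resets_at": datetime.fromtimestamp(core["reset"], tz=timezone.utc),
    }


def fetch_rate_limit(token):
    """Query GET /rate_limit for one token (hypothetical helper)."""
    request = urllib.request.Request(
        f"{API_ROOT}/rate_limit",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return summarize_core_limit(json.load(response))


def assign_streams(streams, token_names):
    """Distribute stream names across token labels in round-robin order."""
    if not token_names:
        raise ValueError("at least one token label is required")
    assignment = {name: [] for name in token_names}
    for index, stream in enumerate(streams):
        assignment[token_names[index % len(token_names)]].append(stream)
    return assignment
```

A round-robin split treats every stream as equally expensive. In practice, high-volume streams such as commits or pull requests may deserve a token of their own.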
Connection Setup Guide
Once you have connected GitHub to a destination, you will also need to configure:
- Connection Pull Schedule: Determines how frequently data is extracted from the source.
- Backfill (Days): Specifies how many days of historical data are retrieved during each connection run.
- Destination-specific settings: Settings such as "Dataset Name" or "Target Schema", depending on your destination.
- Schema Migration Policy: Controls how Extract handles schema changes from the source.
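As a rough illustration of what a Backfill (Days) value means in practice, the sketch below converts a number of days into the ISO 8601 timestamp format that GitHub list endpoints such as issues and commits accept in their `since` parameter. How the connector computes its window internally is not documented here, so treat the function `backfill_since` as a hypothetical stand-in.

```python
from datetime import datetime, timedelta, timezone


def backfill_since(backfill_days, now=None):
    """Return the start of the backfill window as an ISO 8601 UTC string.

    Assumption: the window ends at the current run time and reaches back
    exactly `backfill_days` days.
    """
    if backfill_days < 0:
        raise ValueError("backfill_days must not be negative")
    if now is None:
        now = datetime.now(timezone.utc)
    start = now.astimezone(timezone.utc) - timedelta(days=backfill_days)
    return start.strftime("%Y-%m-%dT%H:%M:%SZ")
```

With a backfill of 7 days and a run at 2024-03-15 12:00 UTC, the window starts at `2024-03-08T12:00:00Z`. A larger value retrieves more history on every run and therefore uses more of the hourly request quota.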
Connector Information
Schema ERD
Explore the interactive entity relationship diagram for GitHub.