Arun Addagatla
10/23/2024, 6:18 PMuser
10/23/2024, 6:19 PM
```python
import airbyte as ab

source = ab.get_source(
    "source-github",
    config={
        "credentials": {
            "personal_access_token": "your_personal_access_token"
        },
        "repositories": ["your_github_username/your_repository_ID"],
    },
)
```
[Source Setup : Github with PyAirbyte](https://airbyte.com/tutorials/end-to-end-rag-using-github-pyairbyte-and-chroma-vector-db)

2. After configuring the source, you can select the specific streams you want to sync. The GitHub connector supports various streams, including commits, issues, pull requests, and more. For example, to select the commits stream:
```python
source.select_streams(["commits"])
```
3. Then, you can read the data from the selected streams:
```python
cache = ab.get_default_cache()
result = source.read(cache=cache)
commits_details = [doc for doc in result["commits"].to_documents()]
```
[Source Setup : Github with PyAirbyte](https://airbyte.com/tutorials/end-to-end-rag-using-github-pyairbyte-and-chroma-vector-db)

Alternatively, you can use the `AirbyteLoader` from the `langchain_airbyte` package:
```python
from langchain_airbyte import AirbyteLoader
from langchain.schema import Document

loader = AirbyteLoader(
    source="source-github",
    stream="commits",
    config={
        "credentials": {
            "personal_access_token": "your_personal_access_token"
        },
        "repositories": ["your_username/repository_name"],
    },
)
docs = loader.load()
docs = [
    Document(page_content=record.page_content, metadata=record.metadata)
    for record in docs
]
```
[Load Data using AirbyteLoader](https://airbyte.com/tutorials/illustrating-the-usage-of-langchain-airbyte-package)

Remember to replace `"your_personal_access_token"`, `"your_github_username/your_repository_ID"`, and `"your_username/repository_name"` with your actual GitHub credentials and repository information.

Note that the GitHub connector allows you to pull various types of data from a repository, including commits, issues, pull requests, and more. The exact content you can pull depends on the streams you select and the permissions associated with your GitHub token.
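
To avoid pasting a real token into the script, the config can be built from an environment variable instead. The following is only a minimal sketch: `build_github_config`, `pick_streams`, and the `GITHUB_PERSONAL_ACCESS_TOKEN` variable name are hypothetical and not part of PyAirbyte or `langchain_airbyte`. Only the config keys (`credentials`, `personal_access_token`, `repositories`) come from the snippets above, and the check assumes plain `owner/repo` entries.

```python
import os
import re

# Hypothetical helpers for illustration; not part of PyAirbyte.
# Assumption: repositories are written as plain "owner/repo" strings.
_REPO_PATTERN = re.compile(r"^[\w.-]+/[\w.-]+$")


def build_github_config(repositories, token=None,
                        env_var="GITHUB_PERSONAL_ACCESS_TOKEN"):
    """Build the config dict shown above, falling back to an env var for the token."""
    token = token or os.environ.get(env_var)
    if not token:
        raise ValueError(f"No token passed and {env_var} is not set")
    bad = [repo for repo in repositories if not _REPO_PATTERN.match(repo)]
    if bad:
        raise ValueError(f"Expected 'owner/repo' entries, got: {bad}")
    return {
        "credentials": {"personal_access_token": token},
        "repositories": list(repositories),
    }


def pick_streams(available, wanted):
    """Keep only the wanted stream names that the connector reports, in the wanted order."""
    available = set(available)
    return [name for name in wanted if name in available]
```

The result of `build_github_config([...])` can be passed as `config=` to `ab.get_source("source-github", ...)` or to `AirbyteLoader`. Likewise, `pick_streams` can filter a wish list against whatever stream names the source reports before calling `source.select_streams(...)`. How you obtain that list of available streams depends on your PyAirbyte version, so check its documentation.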