Quick Start: Knowledge Table File Upload

Prepare your files for RAG

1. Introduction

This guide shows how to use the JamAI Base Python SDK to upload files to Knowledge Tables and embed them for AI-powered document processing and retrieval.

What are Knowledge Tables?

Knowledge Tables are specialized tables in JamAI Base that provide hybrid-search capabilities through both full-text search (FTS) and vector embeddings:

  1. Search Capabilities:

    • Full-Text Search (FTS): Traditional keyword-based search for exact and partial matches

    • Semantic Search: Vector embedding-based search for meaning and context

  2. Document Processing:

    • Automatically chunks documents into manageable segments

    • Generates vector embeddings for semantic understanding

    • Indexes content for full-text search

    • Preserves document structure (tables, layouts, etc.) and metadata

  3. Use Cases:

    • Document retrieval using both keywords and semantic meaning

    • Question-answering agents

    • Content recommendation

    • Knowledge base search and discovery
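
To make the chunking step above concrete, here is a minimal sketch of splitting text into overlapping segments. It is only an illustration of the general idea: JamAI Base's actual chunker is not described in this guide, and the function name `chunk_text` and its character-based parameters are assumptions made for the example.

```python
def chunk_text(text: str, chunk_size: int = 20, chunk_overlap: int = 5) -> list:
    """Split text into character windows that overlap by chunk_overlap.

    Illustrative only: real chunkers usually split on tokens or
    sentence boundaries rather than raw characters.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop once the remaining tail is already covered by the previous window.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


if __name__ == "__main__":
    for chunk in chunk_text("0123456789", chunk_size=4, chunk_overlap=1):
        print(repr(chunk))
```

The overlap means text near a chunk boundary appears in both neighbouring chunks, so a passage that straddles the boundary can still be retrieved as a whole.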

Supported File Types

The following file formats are supported:

  • Text files: .txt, .md, .csv, .tsv

  • Documents: .doc, .docx, .pdf

  • Presentations: .ppt, .pptx

  • Spreadsheets: .xls, .xlsx

  • Markup/Data: .xml, .html, .json, .jsonl
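
A quick extension check before uploading can catch unsupported files early. The sketch below copies the extensions from the list above; the names `SUPPORTED_EXTENSIONS` and `is_supported` are illustrative and not part of the SDK.

```python
from pathlib import Path

# Extensions copied from the supported file types listed above.
SUPPORTED_EXTENSIONS = {
    ".txt", ".md", ".csv", ".tsv",      # text files
    ".doc", ".docx", ".pdf",            # documents
    ".ppt", ".pptx",                    # presentations
    ".xls", ".xlsx",                    # spreadsheets
    ".xml", ".html", ".json", ".jsonl", # markup/data
}


def is_supported(path: str) -> bool:
    """Return True if the file extension is in the supported list.

    The comparison is case-insensitive, so "REPORT.PDF" is accepted.
    Only the extension is checked, not the file content.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```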

Prerequisites

Before starting, you'll need:

  • Python 3.10 or higher

  • Project ID and Personal Access Token (PAT)

  • Documents to process

2. Installation and Setup

Installing Required Packages

Basic Configuration
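
As a minimal sketch of the configuration step, the snippet below reads the Project ID and PAT from environment variables instead of hard-coding them. It assumes the SDK has been installed from PyPI (the package is published as `jamaibase`). The variable names `JAMAI_PROJECT_ID` and `JAMAI_PAT` and the `JamAIConfig` helper are hypothetical choices for this example, not names required by the SDK.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class JamAIConfig:
    project_id: str
    token: str


def load_config(env: Mapping[str, str] = os.environ) -> JamAIConfig:
    """Read credentials from the environment and fail early if any are missing.

    The variable names below are assumptions made for this example.
    """
    required = ("JAMAI_PROJECT_ID", "JAMAI_PAT")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return JamAIConfig(project_id=env["JAMAI_PROJECT_ID"], token=env["JAMAI_PAT"])
```

The resulting values would then be passed to the SDK client constructor. In recent `jamaibase` releases this is typically `JamAI(project_id=..., token=...)`, but check the SDK reference for the version you have installed.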

3. Creating Your Knowledge Table

  1. Navigate to the Knowledge Tables tab in your JamAI Base project

  2. Create a new knowledge table

  3. Note down the table ID for later use

4. Implementation

4.1 Complete Document Uploader Class
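
The following is a minimal sketch of what such an uploader class could look like, not the original implementation. To keep it runnable without network access, the actual upload is injected as a callable (`embed_fn`). With the SDK, that callable would most likely wrap `jamai.table.embed_file(file_path=..., table_id=...)`; treat that call and its parameter names as an assumption to verify against the SDK reference. The class and attribute names are illustrative.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Iterable

# Injected upload function: (file_path, table_id) -> any result.
# Assumption: with the SDK this would wrap the table embed-file call.
EmbedFn = Callable[[str, str], object]


@dataclass
class DocumentUploader:
    embed_fn: EmbedFn
    table_id: str
    succeeded: list = field(default_factory=list)
    failed: dict = field(default_factory=dict)

    def upload_file(self, path: str) -> bool:
        """Upload one file; record the outcome instead of raising."""
        if not Path(path).is_file():
            self.failed[path] = "file not found"
            return False
        try:
            self.embed_fn(path, self.table_id)
        except Exception as exc:  # keep going so one bad file does not stop a batch
            self.failed[path] = str(exc)
            return False
        self.succeeded.append(path)
        return True

    def upload_many(self, paths: Iterable[str]) -> dict:
        """Upload several files and return a small summary."""
        for path in paths:
            self.upload_file(path)
        return {"succeeded": len(self.succeeded), "failed": len(self.failed)}
```

Injecting the upload function keeps the class easy to test with a stub and leaves a single place to change if the SDK call differs in your version.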

5. Complete Standalone Script

Save this as knowledge_uploader.py:

6. Usage Examples

Single File Upload

Folder Upload
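
For a folder upload, the first step is gathering the files to send. The sketch below collects files with matching extensions from a folder, optionally descending into subfolders. The function name `collect_files` and its parameters are illustrative, and the upload itself is left out.

```python
from pathlib import Path
from typing import Iterable, List


def collect_files(folder: str, extensions: Iterable[str], recursive: bool = False) -> List[Path]:
    """Return a sorted list of files in folder whose extension is in extensions."""
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(f"Not a folder: {folder}")
    wanted = {ext.lower() for ext in extensions}
    # rglob("*") walks subfolders; iterdir() looks at the top level only.
    candidates = root.rglob("*") if recursive else root.iterdir()
    return sorted(p for p in candidates if p.is_file() and p.suffix.lower() in wanted)


if __name__ == "__main__":
    for path in collect_files(".", {".pdf", ".md", ".txt"}):
        print(path)
```

Sorting the result makes runs reproducible, which helps when comparing logs between uploads.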

Custom Chunk Settings
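
One way to organise custom chunk settings is a per-extension lookup with a default and optional overrides, sketched below. The numeric values are placeholders chosen for illustration, not recommended or official settings. The keys `chunk_size` and `chunk_overlap` are assumed to match the parameters your SDK version accepts when embedding a file.

```python
from pathlib import Path
from typing import Optional

# Placeholder values for illustration only; tune them for your documents.
DEFAULT_SETTINGS = {"chunk_size": 1000, "chunk_overlap": 200}
SETTINGS_BY_EXTENSION = {
    ".pdf": {"chunk_size": 1500, "chunk_overlap": 200},
    ".csv": {"chunk_size": 500, "chunk_overlap": 0},
}


def chunk_settings_for(path: str, overrides: Optional[dict] = None) -> dict:
    """Pick chunk settings by file extension, then apply any overrides."""
    ext = Path(path).suffix.lower()
    settings = dict(SETTINGS_BY_EXTENSION.get(ext, DEFAULT_SETTINGS))
    settings.update(overrides or {})
    if settings["chunk_overlap"] >= settings["chunk_size"]:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    return settings
```

The returned dictionary can be unpacked into the upload call as keyword arguments, provided the parameter names match.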

7. Example Output

For folder processing:

8. Best Practices

  1. File Handling

    • Always validate files before upload

    • Use appropriate chunk sizes for different document types

    • Handle large files appropriately

  2. Performance

    • Reuse the client instance

    • Process files in batches

    • Consider implementing rate limiting for large batches

  3. Error Handling

    • Validate input files

    • Handle network errors gracefully

    • Provide meaningful error messages

  4. Security

    • Use environment variables for credentials

    • Validate file content when necessary

    • Implement proper access controls
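
The performance and error-handling points above can be sketched with a few small helpers: splitting work into batches, pausing between batches as a simple form of rate limiting, and retrying on network errors with exponential backoff. All names here (`batched`, `with_retries`, `process_in_batches`, `upload_one`) and the default delays are assumptions for illustration. Which exception types the SDK raises on network failure is not stated in this guide, so the built-in `ConnectionError` and `TimeoutError` stand in for them.

```python
import time
from typing import Callable, Iterable, Iterator, List


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most size items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def with_retries(fn: Callable[[], object], attempts: int = 3,
                 base_delay: float = 1.0, sleep=time.sleep) -> object:
    """Call fn, retrying on network-style errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)


def process_in_batches(paths: Iterable[str], upload_one: Callable[[str], object],
                       batch_size: int = 5, pause: float = 1.0,
                       sleep=time.sleep) -> dict:
    """Upload paths batch by batch, pausing between batches."""
    results = {}
    for index, batch in enumerate(batched(paths, batch_size)):
        if index:
            sleep(pause)  # simple rate limiting between batches
        for path in batch:
            results[path] = with_retries(lambda: upload_one(path), sleep=sleep)
    return results
```

Passing `sleep` as a parameter keeps the helpers testable without real delays. In production the default `time.sleep` applies.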

This implementation provides a foundation for uploading documents to JamAI Base Knowledge Tables, covering the supported file types listed above and allowing chunk settings to be adjusted per format.
