🧠 AudiText AI

The Intelligent Audio Engine for the Semantic Web

AudiText is a next-generation AI-powered audio reader that transforms the static web into immersive audio experiences. By combining AI-based content cleanup with the browser's built-in Text-to-Speech (TTS), it doesn't just read text: it identifies the main content, removes clutter, and delivers a clean listening experience.

Key Features • AI Capabilities • Tech Stack • Security • Quick Start

🚀 Why AudiText?

In an era of information overload, AudiText serves as your intelligent filter. Unlike standard screen readers that blindly recite metadata and ads, AudiText uses a bespoke Smart Polish Layer to semantically analyze content structure. It identifies the core narrative, strips away "hashtag spam" and repetitive headers, and synthesizes the remaining essence into fluid, human-like speech.

Whether you're commuting with a long-form article or multitasking with a Twitter thread, AudiText ensures you consume knowledge, not noise.

✨ Key Features

🧠 Smart Content Processing

Semantic Text Extraction: Automatically parses complex DOM structures from X (Twitter), Medium, Substack, and more.
AI-Driven Polish Layer:
- Contextual Cleanup: Eliminates "clickbait" hooks, hashtags, and repetitive boilerplate.
- Smart Intro Generation: Synthesizes professional intros ("Title, by Author") even when metadata is sparse.
- Deduplication Engine: Detects and suppresses redundant information for seamless flow.
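
The three polish steps above can be illustrated with a minimal sketch. This is not AudiText's actual implementation: the names `stripHashtags`, `dedupeLines`, `buildIntro`, and `polish`, the hashtag pattern, and the "Untitled article" fallback are all assumptions made for illustration.

```typescript
// Hypothetical sketch of a rule-based polish pass (not the project's real code).

interface ArticleMeta {
  title?: string;
  author?: string;
}

/** Remove hashtags such as "#AI" and collapse the whitespace they leave behind. */
function stripHashtags(line: string): string {
  return line
    .replace(/#[A-Za-z_]\w*/g, "")
    .replace(/\s{2,}/g, " ")
    .trim();
}

/** Keep the first occurrence of each line, comparing case-insensitively. */
function dedupeLines(lines: string[]): string[] {
  const seen = new Set<string>();
  const kept: string[] = [];
  for (const line of lines) {
    const key = line.toLowerCase().replace(/\s+/g, " ").trim();
    if (key === "" || seen.has(key)) continue;
    seen.add(key);
    kept.push(line);
  }
  return kept;
}

/** Build a spoken intro ("Title, by Author"), tolerating missing metadata. */
function buildIntro(meta: ArticleMeta): string {
  const title = meta.title?.trim() || "Untitled article"; // assumed fallback
  const author = meta.author?.trim();
  return author ? `${title}, by ${author}` : title;
}

/** Run all three steps: intro generation, hashtag cleanup, deduplication. */
function polish(raw: string, meta: ArticleMeta = {}): string {
  const cleaned = raw.split(/\r?\n/).map(stripHashtags);
  return [buildIntro(meta), ...dedupeLines(cleaned)].join("\n");
}
```

A call such as `polish(threadText, { title: "My Post", author: "Ann" })` would return the intro line followed by the cleaned, de-duplicated body, ready to hand to the speech engine.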

🎧 Immersive Playback Engine

Native Neural TTS: Uses the browser's built-in Web Speech API, so speech synthesis has no API quotas and works offline when the device provides local voices. Voice quality depends on the browser and operating system.
Clean Player Interface: Minimalist, bottom-aligned controls optimized for one-handed mobile use.
Dynamic Speed Control: Variable playback rates (0.5x - 2.5x) with pitch correction.
Deep Linking & Sharing: Share articles with ?share= URL parameters for instant playback.
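
As a rough sketch of how the speed range and the ?share= parameter could be handled, consider the helpers below. The function names, the fallback rate of 1, and the example origin are assumptions for illustration; only the 0.5x to 2.5x range, the `share` parameter name, and the use of the Web Speech API come from the feature list above.

```typescript
// Hypothetical helpers; AudiText's real player code may differ.

const MIN_RATE = 0.5;
const MAX_RATE = 2.5;

/** Keep a requested playback rate inside the supported 0.5x to 2.5x range. */
function clampRate(rate: number): number {
  if (!isFinite(rate)) return 1; // assumed fallback: normal speed
  return Math.min(MAX_RATE, Math.max(MIN_RATE, rate));
}

/** Build a link that opens the app with an article preloaded via ?share=. */
function buildShareLink(appOrigin: string, articleUrl: string): string {
  const link = new URL(appOrigin);
  link.searchParams.set("share", articleUrl); // value is percent-encoded for us
  return link.toString();
}

/** Read the shared article URL from the current page URL, if present. */
function readSharedArticle(pageUrl: string): string | null {
  const shared = new URL(pageUrl).searchParams.get("share");
  return shared && shared.trim() !== "" ? shared : null;
}

/** Speak text through the Web Speech API; does nothing where it is unavailable. */
function speak(text: string, rate: number): void {
  // `any` keeps this sketch compilable without DOM typings.
  const synth = (globalThis as any).speechSynthesis;
  const Utterance = (globalThis as any).SpeechSynthesisUtterance;
  if (!synth || !Utterance) return;
  const utterance = new Utterance(text);
  utterance.rate = clampRate(rate);
  synth.speak(utterance);
}
```

In a browser, `readSharedArticle(window.location.href)` would be checked once on load, and a non-null result passed straight to the extraction step.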

🛡️ Enterprise-Grade Security

Row Level Security (RLS): Database policies strictly enforce data sovereignty—users can only access their own library items.
Input Hardening: Advanced sanitization prevents SQL/Command injection and XSS attacks via URL inputs.
Auth Integrity: Sign-in and session handling are delegated to Supabase Auth.
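
To make the input-hardening idea concrete, here is a minimal sketch of validating a pasted URL before it is sent for extraction. It is an illustration under assumptions, not the project's actual sanitizer: the name `validateArticleUrl`, the 2048-character limit, and the specific rejection rules are hypothetical.

```typescript
// Hypothetical URL validation sketch; the real sanitization rules may differ.

type UrlCheck = { ok: true; url: string } | { ok: false; reason: string };

const MAX_URL_LENGTH = 2048; // assumed limit

function validateArticleUrl(input: string): UrlCheck {
  const trimmed = input.trim();
  if (trimmed.length === 0 || trimmed.length > MAX_URL_LENGTH) {
    return { ok: false, reason: "empty or too long" };
  }

  let parsed: URL;
  try {
    parsed = new URL(trimmed); // throws on anything that is not an absolute URL
  } catch {
    return { ok: false, reason: "not a valid absolute URL" };
  }

  // Rejects javascript:, data:, file: and similar schemes.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return { ok: false, reason: "only http(s) URLs are accepted" };
  }
  if (parsed.username !== "" || parsed.password !== "") {
    return { ok: false, reason: "credentials in URLs are not accepted" };
  }

  // Return the normalized form so downstream code never sees the raw input.
  return { ok: true, url: parsed.toString() };
}
```

Parsing with the standard `URL` class and allow-listing schemes is a common first line of defense; it complements, rather than replaces, parameterized queries and output escaping.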

📱 Premium UX/UI

"Reactive Noir" Aesthetic: A cohesive design language featuring glassmorphism, adaptive film grain (noise), and procedural gradients.
Mobile-First Progressive Web App (PWA): Touch-optimized scrub bars, haptic feedback integration, and 60fps animations on mobile devices.
Interactive DotGrid Background: GPU-accelerated particle effect with click-to-ripple interaction.

⚡ Production-Grade Performance

React.memo Optimization: All heavy components (DotGrid, Noise, SwipeableItem) are memoized to prevent unnecessary re-renders.
GPU-Accelerated Animations: CSS animations use transform: translateZ(0) and will-change hints for buttery 60fps performance.
Spatial Partitioning: DotGrid buckets dots into a spatial grid, so hover detection inspects only the cells near the pointer (roughly O(1) per lookup) instead of scanning all n dots (O(n)).
Optimized Bundle: ~656 KB total (148 KB gzipped) with vendor chunk splitting.
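
The spatial partitioning idea can be sketched as a uniform grid keyed by cell coordinates. This is an illustrative version, not DotGrid's actual code; the `SpatialGrid` class, its method names, and the `Dot` shape are assumptions.

```typescript
// Hypothetical uniform-grid sketch of the hover lookup.

interface Dot {
  x: number;
  y: number;
}

class SpatialGrid {
  private cells = new Map<string, Dot[]>();
  private cellSize: number;

  constructor(cellSize: number) {
    this.cellSize = cellSize;
  }

  private key(cx: number, cy: number): string {
    return `${cx},${cy}`;
  }

  /** Place a dot in the bucket of the cell that contains it. */
  insert(dot: Dot): void {
    const k = this.key(
      Math.floor(dot.x / this.cellSize),
      Math.floor(dot.y / this.cellSize)
    );
    const bucket = this.cells.get(k);
    if (bucket) bucket.push(dot);
    else this.cells.set(k, [dot]);
  }

  /** Return dots within `radius` of (px, py), visiting only overlapping cells. */
  near(px: number, py: number, radius: number): Dot[] {
    const minX = Math.floor((px - radius) / this.cellSize);
    const maxX = Math.floor((px + radius) / this.cellSize);
    const minY = Math.floor((py - radius) / this.cellSize);
    const maxY = Math.floor((py + radius) / this.cellSize);
    const hits: Dot[] = [];
    for (let cx = minX; cx <= maxX; cx++) {
      for (let cy = minY; cy <= maxY; cy++) {
        const bucket = this.cells.get(this.key(cx, cy));
        if (!bucket) continue;
        for (const dot of bucket) {
          const dx = dot.x - px;
          const dy = dot.y - py;
          if (dx * dx + dy * dy <= radius * radius) hits.push(dot);
        }
      }
    }
    return hits;
  }
}
```

When the cell size is on the order of the hover radius, each query touches a small, fixed number of cells, so the cost stays roughly constant as the total number of dots grows.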

🏗️ Cloud & AI Architecture

The system uses a Dual-Layer Extraction Pipeline to ensure reliability even when AI credits are exhausted.

graph TD
    Design[Figma Design] -.-> |"AI Generation (90% Fidelity)"| Components
    User[User / PWA] -->|1. Paste URL| Edge[Supabase Edge Function]
    User -->|Listen| BrowserTTS[Browser Native TTS]
    User -->|Sync| DB[(Supabase Database)]
    
    subgraph Backend [Edge Function: extract-content]
        Edge -->|Fetch Raw HTML| Jina[Jina AI Reader]
        Edge -->|Clean Text| AI_Logic{Has Credits?}
        AI_Logic -->|Yes| Gemini[Google Gemini 2.0]
        AI_Logic -->|No| Manual[Robust Regex Cleaner]
    end
    
    subgraph Frontend [React + Vite + Framer Motion]
        Components[React Components]
        Store[Local Storage] <-->|Cache| State[Audio Context]
        Components -.-> State
        State -->|Audio Data| Visuals
        subgraph Visuals [Visual Engine]
             Bits[react-bits / DotGrid]
             Shimmer[GPU-Accelerated Shimmer]
        end
    end
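
The fallback behavior in the diagram can be sketched as follows. This is a simplified illustration, not the code of the extract-content Edge Function: the `ExtractDeps` interface and its member names are assumptions, and the reader and AI calls are injected as plain functions so that no particular provider API is implied.

```typescript
// Hypothetical sketch of the dual-layer pipeline: AI cleaning with a regex fallback.

type AsyncCleaner = (markdown: string) => Promise<string>;

interface ExtractDeps {
  /** Fetches the page as Markdown (the reader step in the diagram). */
  fetchMarkdown: (url: string) => Promise<string>;
  /** AI cleaner; leave undefined when no key or credits are available. */
  aiClean?: AsyncCleaner;
  /** Deterministic fallback cleaner. */
  regexClean: (markdown: string) => string;
}

interface ExtractResult {
  text: string;
  cleanedBy: "ai" | "regex";
}

async function extractContent(
  url: string,
  deps: ExtractDeps
): Promise<ExtractResult> {
  const markdown = await deps.fetchMarkdown(url);

  if (deps.aiClean) {
    try {
      return { text: await deps.aiClean(markdown), cleanedBy: "ai" };
    } catch {
      // Quota exhausted or provider error: fall through to the regex layer.
    }
  }

  return { text: deps.regexClean(markdown), cleanedBy: "regex" };
}
```

Returning `cleanedBy` alongside the text lets the client show which layer produced the result, which can help when debugging extraction quality.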

🛠️ Tech Stack

Frontend Core

| Technology | Role |
| --- | --- |
| React 19 | UI Library with modern hooks architecture |
| TypeScript | Strict static typing for robustness |
| Vite | Next-gen frontend tooling and bundling |

Visuals & Animation

| Technology | Role |
| --- | --- |
| Framer Motion | Physics-based UI animations |
| GSAP (GreenSock) | Commercial-grade transitions for DotGrid |
| Custom Canvas | GPU-accelerated DotGrid with spatial partitioning |
| Lucide React | Consistent, lightweight iconography |

Backend & Data

| Technology | Role |
| --- | --- |
| Supabase (PostgreSQL) | Relational database with real-time subscriptions |
| Supabase Auth | User management and secure session handling |
| Supabase Edge Functions | Serverless content extraction |
| Row Level Security (RLS) | Database-level access control policies |

AI Services

| Technology | Role |
| --- | --- |
| Jina AI Reader | URL to clean Markdown extraction |
| Google Gemini 2.0 | Content cleaning and formatting |

📸 Experience

⚡ Quick Start

Prerequisites

Node.js 18+
npm or yarn

Installation

Clone the Repository

git clone https://github.com/nabrahma/AudiText.git
cd AudiText

Environment Setup: Copy the example file to create a .env file in the root directory:

cp .env.example .env

Populate it with your credentials:

VITE_SUPABASE_URL=your_supabase_url
VITE_SUPABASE_ANON_KEY=your_supabase_anon_key
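
A small startup check can catch a missing or empty variable early. The sketch below is an assumption about how one might validate these two values, not code from the repository; `readSupabaseEnv` and `SupabaseEnv` are hypothetical names. In a Vite app, variables prefixed with `VITE_` are exposed on `import.meta.env`, which could be passed in as the `env` argument.

```typescript
// Hypothetical startup check for the two variables listed above.

interface SupabaseEnv {
  url: string;
  anonKey: string;
}

function readSupabaseEnv(env: Record<string, string | undefined>): SupabaseEnv {
  const url = env["VITE_SUPABASE_URL"]?.trim();
  const anonKey = env["VITE_SUPABASE_ANON_KEY"]?.trim();

  // Collect every missing name so the error reports them all at once.
  const missing: string[] = [];
  if (!url) missing.push("VITE_SUPABASE_URL");
  if (!anonKey) missing.push("VITE_SUPABASE_ANON_KEY");

  if (!url || !anonKey) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return { url, anonKey };
}
```

Failing fast with the variable names in the message tends to be easier to diagnose than a network error from a client created with an undefined URL.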

Install Dependencies
```
npm install
```
Launch Development Server
```
npm run dev
```

🤝 Contributing

We welcome contributions from the community, whether that means enhancing the AI parsing logic or adding new visual effects.

Fork the repository.
Create a feature branch (git checkout -b feature/EnhancedTTS).
Commit your changes with clear messages (git commit -m 'feat: Add voice selection').
Push to the branch (git push origin feature/EnhancedTTS).
Open a Pull Request.

Edge Function Secrets (Optional / Advanced)

The core AudiText experience (Content Extraction + Native Browser TTS) requires minimal setup. However, the backend infrastructure supports advanced capabilities if you wish to enable them.

| Variable Name | Service | Status | Purpose |
| --- | --- | --- | --- |
| `JINA_API_KEY` | Jina.ai | Required | Essential for converting raw URLs into clean Markdown. |
| `GEMINI_API_KEY` | Google Gemini | Recommended | Greatly improves article cleaning and formatting. |

📄 License

Distributed under the MIT License. See LICENSE for more information.

Built with 🧠 + ❤️ by Nabaskar
