Natural Language to PostgreSQL: Querying with AI

Natural language to PostgreSQL is like having a translator between human thoughts and database queries. Instead of memorizing SQL syntax, you can ask questions in plain English like "Show me all users who signed up last month" and get the exact SQL query you need. It's the difference between learning a foreign language and having an interpreter who speaks it fluently.

Why Natural Language to SQL Matters

Writing SQL queries can be challenging:

  • Complex Syntax: JOINs, subqueries, window functions
  • Schema Knowledge: Understanding table structures and relationships
  • Performance Optimization: Writing efficient queries
  • Error Handling: Debugging syntax and logic errors
  • Accessibility: Making databases usable by non-technical users

AI can help by:

  • Democratizing Data Access: Anyone can query databases
  • Reducing Errors: Fewer syntax mistakes and logical errors
  • Accelerating Development: Faster query writing and debugging
  • Improving Documentation: Self-documenting queries
  • Enabling Natural Interfaces: Chatbots and voice assistants

How Natural Language to SQL Works

Natural Language to SQL Process

Implementation with Vercel AI SDK

Basic Natural Language to SQL Converter

// lib/nl2sql/basicConverter.ts
import { openai } from '@ai-sdk/openai'
import { streamText } from 'ai'

interface DatabaseSchema {
  tables: TableSchema[]
  relationships: Relationship[]
}

interface TableSchema {
  name: string
  description: string
  columns: ColumnSchema[]
}

interface ColumnSchema {
  name: string
  type: string
  description: string
  nullable: boolean
  primaryKey?: boolean
  foreignKey?: {
    table: string
    column: string
  }
}

interface Relationship {
  from: { table: string; column: string }
  to: { table: string; column: string }
  type: 'one-to-one' | 'one-to-many' | 'many-to-many'
}

interface SQLQuery {
  query: string
  explanation: string
  tables: string[]
  estimatedCost: 'low' | 'medium' | 'high'
}

class BasicNL2SQLConverter {
  private schema: DatabaseSchema

  constructor(schema: DatabaseSchema) {
    this.schema = schema
  }

  async convertToSQL(naturalLanguage: string): Promise<SQLQuery> {
    const prompt = this.buildPrompt(naturalLanguage)
    
    const result = await streamText({
      model: openai('gpt-4'),
      messages: [{ role: 'user', content: prompt }],
      temperature: 0.1,
      maxTokens: 1000,
    })

    try {
      const parsed = this.parseSQLResponse(result.text)
      return parsed
    } catch (error) {
      console.error('Failed to parse SQL response:', error)
      throw new Error('Failed to convert natural language to SQL')
    }
  }

  private buildPrompt(naturalLanguage: string): string {
    const schemaDescription = this.buildSchemaDescription()
    
    return `Convert this natural language question to a PostgreSQL query:

Question: "${naturalLanguage}"

Database Schema:
${schemaDescription}

Requirements:
- Use proper PostgreSQL syntax
- Include appropriate JOINs when needed
- Add comments explaining the query logic
- Optimize for performance
- Handle edge cases (NULL values, etc.)
- Use meaningful column aliases
- Format the query for readability

Respond with a JSON object:
{
  "query": "The complete SQL query",
  "explanation": "Explanation of what the query does",
  "tables": ["table1", "table2"],
  "estimatedCost": "low|medium|high"
}

Generate a clean, production-ready PostgreSQL query.`
  }

  private buildSchemaDescription(): string {
    let description = ''

    for (const table of this.schema.tables) {
      description += `\nTable: ${table.name}\n`
      description += `Description: ${table.description}\n`
      description += 'Columns:\n'
      
      for (const column of table.columns) {
        description += `  - ${column.name} (${column.type})`
        if (column.primaryKey) description += ' [PRIMARY KEY]'
        if (column.foreignKey) description += ` [FK -> ${column.foreignKey.table}.${column.foreignKey.column}]`
        if (!column.nullable) description += ' [NOT NULL]'
        description += ` - ${column.description}\n`
      }
    }

    if (this.schema.relationships.length > 0) {
      description += '\nRelationships:\n'
      for (const rel of this.schema.relationships) {
        description += `  - ${rel.from.table}.${rel.from.column} -> ${rel.to.table}.${rel.to.column} (${rel.type})\n`
      }
    }

    return description
  }

  private parseSQLResponse(text: string): SQLQuery {
    // Try to extract JSON from the response
    const jsonMatch = text.match(/\{[\s\S]*\}/)
    if (!jsonMatch) {
      throw new Error('No valid JSON found in response')
    }

    const parsed = JSON.parse(jsonMatch[0])
    
    return {
      query: parsed.query || text,
      explanation: parsed.explanation || 'Generated SQL query',
      tables: parsed.tables || [],
      estimatedCost: parsed.estimatedCost || 'medium',
    }
  }

  async convertMultiple(questions: string[]): Promise<SQLQuery[]> {
    const queries: SQLQuery[] = []

    for (const question of questions) {
      try {
        const query = await this.convertToSQL(question)
        queries.push(query)
      } catch (error) {
        console.error(`Failed to convert: ${question}`, error)
        queries.push({
          query: '-- Failed to generate query',
          explanation: `Error: ${error instanceof Error ? error.message : 'Unknown error'}`,
          tables: [],
          estimatedCost: 'high',
        })
      }
    }

    return queries
  }
}

// Example schema
const sampleSchema: DatabaseSchema = {
  tables: [
    {
      name: 'users',
      description: 'User accounts and profile information',
      columns: [
        { name: 'id', type: 'SERIAL', description: 'Unique user identifier', nullable: false, primaryKey: true },
        { name: 'email', type: 'VARCHAR(255)', description: 'User email address', nullable: false },
        { name: 'name', type: 'VARCHAR(100)', description: 'User full name', nullable: false },
        { name: 'created_at', type: 'TIMESTAMP', description: 'Account creation date', nullable: false },
        { name: 'status', type: 'VARCHAR(20)', description: 'Account status (active, inactive, suspended)', nullable: false },
      ],
    },
    {
      name: 'orders',
      description: 'Customer orders and transactions',
      columns: [
        { name: 'id', type: 'SERIAL', description: 'Unique order identifier', nullable: false, primaryKey: true },
        { name: 'user_id', type: 'INTEGER', description: 'Reference to users table', nullable: false, foreignKey: { table: 'users', column: 'id' } },
        { name: 'total_amount', type: 'DECIMAL(10,2)', description: 'Order total amount', nullable: false },
        { name: 'status', type: 'VARCHAR(20)', description: 'Order status (pending, completed, cancelled)', nullable: false },
        { name: 'created_at', type: 'TIMESTAMP', description: 'Order creation date', nullable: false },
      ],
    },
    {
      name: 'products',
      description: 'Product catalog information',
      columns: [
        { name: 'id', type: 'SERIAL', description: 'Unique product identifier', nullable: false, primaryKey: true },
        { name: 'name', type: 'VARCHAR(200)', description: 'Product name', nullable: false },
        { name: 'price', type: 'DECIMAL(10,2)', description: 'Product price', nullable: false },
        { name: 'category', type: 'VARCHAR(50)', description: 'Product category', nullable: true },
        { name: 'in_stock', type: 'BOOLEAN', description: 'Product availability', nullable: false },
      ],
    },
  ],
  relationships: [
    {
      from: { table: 'orders', column: 'user_id' },
      to: { table: 'users', column: 'id' },
      type: 'many-to-one',
    },
  ],
}

// Usage example
const converter = new BasicNL2SQLConverter(sampleSchema)

const query = await converter.convertToSQL('Show me all users who signed up last month and their total order amounts')
console.log('Generated SQL:', query.query)
console.log('Explanation:', query.explanation)

Advanced NL2SQL with Query Optimization

// lib/nl2sql/advancedConverter.ts
import { Pool } from 'pg'

interface QueryAnalysis {
  query: string
  explanation: string
  tables: string[]
  estimatedCost: 'low' | 'medium' | 'high'
  optimizationSuggestions: string[]
  executionPlan?: any
  estimatedRows?: number
}

interface QueryResult {
  data: any[]
  metadata: {
    rowCount: number
    executionTime: number
    columns: string[]
  }
}

class AdvancedNL2SQLConverter extends BasicNL2SQLConverter {
  private dbPool: Pool
  private queryCache: Map<string, QueryAnalysis> = new Map()

  constructor(schema: DatabaseSchema, dbPool: Pool) {
    super(schema)
    this.dbPool = dbPool
  }

  async convertWithOptimization(naturalLanguage: string): Promise<QueryAnalysis> {
    // Check cache first
    const cacheKey = this.generateCacheKey(naturalLanguage)
    const cached = this.queryCache.get(cacheKey)
    if (cached) {
      return cached
    }

    // Generate initial query
    const initialQuery = await this.convertToSQL(naturalLanguage)
    
    // Analyze and optimize
    const analysis = await this.analyzeAndOptimize(initialQuery)
    
    // Cache the result
    this.queryCache.set(cacheKey, analysis)
    
    return analysis
  }

  private async analyzeAndOptimize(query: SQLQuery): Promise<QueryAnalysis> {
    const analysis: QueryAnalysis = {
      query: query.query,
      explanation: query.explanation,
      tables: query.tables,
      estimatedCost: query.estimatedCost,
      optimizationSuggestions: [],
    }

    try {
      // Get execution plan
      const planResult = await this.dbPool.query(`EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) ${query.query}`)
      analysis.executionPlan = planResult.rows[0]['QUERY PLAN']

      // Analyze execution plan
      const suggestions = this.analyzeExecutionPlan(analysis.executionPlan)
      analysis.optimizationSuggestions = suggestions

      // Estimate row count
      const countQuery = `SELECT COUNT(*) FROM (${query.query}) as subquery`
      const countResult = await this.dbPool.query(countQuery)
      analysis.estimatedRows = parseInt(countResult.rows[0].count)

      // Update cost estimation
      analysis.estimatedCost = this.estimateCost(analysis.executionPlan, analysis.estimatedRows)

    } catch (error) {
      console.error('Failed to analyze query:', error)
      analysis.optimizationSuggestions.push('Unable to analyze query execution plan')
    }

    return analysis
  }

  private analyzeExecutionPlan(plan: any): string[] {
    const suggestions: string[] = []

    if (!plan || !plan[0]) {
      return ['Unable to analyze execution plan']
    }

    const analyzeNode = (node: any): void => {
      if (node['Node Type'] === 'Seq Scan') {
        suggestions.push('Consider adding an index for sequential scan operations')
      }
      
      if (node['Node Type'] === 'Sort' && node['Sort Method'] === 'external') {
        suggestions.push('Consider adding an index to avoid external sorting')
      }
      
      if (node['Node Type'] === 'Hash Join') {
        suggestions.push('Hash join detected - consider index optimization for better performance')
      }

      if (node['Actual Rows'] && node['Estimated Rows']) {
        const actual = node['Actual Rows']
        const estimated = node['Estimated Rows']
        const ratio = actual / estimated
        
        if (ratio > 10 || ratio < 0.1) {
          suggestions.push('Row count estimation is significantly off - consider updating table statistics')
        }
      }

      // Recursively analyze child nodes
      if (node['Plans']) {
        node['Plans'].forEach(analyzeNode)
      }
    }

    analyzeNode(plan[0])
    return suggestions
  }

  private estimateCost(plan: any, rowCount: number): 'low' | 'medium' | 'high' {
    if (!plan || !plan[0]) return 'medium'

    const totalCost = plan[0]['Total Cost'] || 0
    const actualTime = plan[0]['Actual Total Time'] || 0

    if (totalCost < 1000 && actualTime < 100) return 'low'
    if (totalCost < 10000 && actualTime < 1000) return 'medium'
    return 'high'
  }

  async executeQuery(analysis: QueryAnalysis): Promise<QueryResult> {
    const startTime = Date.now()
    
    try {
      const result = await this.dbPool.query(analysis.query)
      const executionTime = Date.now() - startTime

      return {
        data: result.rows,
        metadata: {
          rowCount: result.rowCount,
          executionTime,
          columns: result.fields.map(field => field.name),
        },
      }
    } catch (error) {
      throw new Error(`Query execution failed: ${error instanceof Error ? error.message : 'Unknown error'}`)
    }
  }

  async suggestOptimizations(query: string): Promise<string[]> {
    const optimizationPrompt = `Analyze this PostgreSQL query and suggest optimizations:

\`\`\`sql
${query}
\`\`\`

Consider:
- Index usage
- JOIN optimization
- WHERE clause efficiency
- Subquery optimization
- Query structure improvements

Provide specific, actionable suggestions.`

    const result = await streamText({
      model: openai('gpt-4'),
      messages: [{ role: 'user', content: optimizationPrompt }],
      temperature: 0.3,
      maxTokens: 500,
    })

    return result.text.split('\n').filter(line => line.trim().length > 0)
  }

  private generateCacheKey(naturalLanguage: string): string {
    return Buffer.from(naturalLanguage.toLowerCase().trim()).toString('base64')
  }

  clearCache(): void {
    this.queryCache.clear()
  }
}

// Usage example
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
})

const advancedConverter = new AdvancedNL2SQLConverter(sampleSchema, pool)

const analysis = await advancedConverter.convertWithOptimization(
  'Find users who spent more than $1000 in the last 3 months'
)

console.log('Optimized query:', analysis.query)
console.log('Suggestions:', analysis.optimizationSuggestions)

const result = await advancedConverter.executeQuery(analysis)
console.log('Query results:', result.data)

Interactive Query Builder

// lib/nl2sql/interactiveBuilder.ts
interface QueryBuilderState {
  naturalLanguage: string
  generatedSQL: string
  explanation: string
  suggestions: string[]
  isExecuting: boolean
  results: any[]
  error: string | null
}

class InteractiveQueryBuilder {
  private converter: AdvancedNL2SQLConverter
  private state: QueryBuilderState

  constructor(converter: AdvancedNL2SQLConverter) {
    this.converter = converter
    this.state = {
      naturalLanguage: '',
      generatedSQL: '',
      explanation: '',
      suggestions: [],
      isExecuting: false,
      results: [],
      error: null,
    }
  }

  async buildQuery(naturalLanguage: string): Promise<void> {
    this.state.naturalLanguage = naturalLanguage
    this.state.error = null

    try {
      const analysis = await this.converter.convertWithOptimization(naturalLanguage)
      
      this.state.generatedSQL = analysis.query
      this.state.explanation = analysis.explanation
      this.state.suggestions = analysis.optimizationSuggestions
    } catch (error) {
      this.state.error = error instanceof Error ? error.message : 'Failed to generate query'
    }
  }

  async executeQuery(): Promise<void> {
    if (!this.state.generatedSQL) {
      this.state.error = 'No query to execute'
      return
    }

    this.state.isExecuting = true
    this.state.error = null

    try {
      const analysis = await this.converter.convertWithOptimization(this.state.naturalLanguage)
      const result = await this.converter.executeQuery(analysis)
      
      this.state.results = result.data
    } catch (error) {
      this.state.error = error instanceof Error ? error.message : 'Query execution failed'
    } finally {
      this.state.isExecuting = false
    }
  }

  async refineQuery(feedback: string): Promise<void> {
    const refinementPrompt = `Refine this SQL query based on the user feedback:

Original question: "${this.state.naturalLanguage}"
Generated SQL: \`\`\`sql
${this.state.generatedSQL}
\`\`\`

User feedback: "${feedback}"

Please provide an improved SQL query that addresses the feedback.`

    const result = await streamText({
      model: openai('gpt-4'),
      messages: [{ role: 'user', content: refinementPrompt }],
      temperature: 0.3,
      maxTokens: 1000,
    })

    try {
      const jsonMatch = result.text.match(/\{[\s\S]*\}/)
      if (jsonMatch) {
        const parsed = JSON.parse(jsonMatch[0])
        this.state.generatedSQL = parsed.query
        this.state.explanation = parsed.explanation || this.state.explanation
      } else {
        this.state.generatedSQL = result.text
      }
    } catch (error) {
      this.state.error = 'Failed to refine query'
    }
  }

  getState(): QueryBuilderState {
    return { ...this.state }
  }

  async getQueryHistory(): Promise<Array<{
    question: string
    query: string
    timestamp: Date
    executionTime?: number
  }>> {
    // This would typically come from a database
    return []
  }
}

// React component for the interactive builder
// components/InteractiveQueryBuilder.tsx
'use client'

import { useState, useEffect } from 'react'

interface InteractiveQueryBuilderProps {
  converter: AdvancedNL2SQLConverter
}

export function InteractiveQueryBuilder({ converter }: InteractiveQueryBuilderProps) {
  const [builder] = useState(() => new InteractiveQueryBuilder(converter))
  const [state, setState] = useState(builder.getState())
  const [feedback, setFeedback] = useState('')

  const handleBuildQuery = async () => {
    await builder.buildQuery(state.naturalLanguage)
    setState(builder.getState())
  }

  const handleExecuteQuery = async () => {
    await builder.executeQuery()
    setState(builder.getState())
  }

  const handleRefineQuery = async () => {
    await builder.refineQuery(feedback)
    setState(builder.getState())
    setFeedback('')
  }

  return (
    <div className="max-w-6xl mx-auto p-6">
      <div className="mb-8">
        <h1 className="text-3xl font-bold text-gray-900 mb-4">
          Natural Language to SQL Builder
        </h1>
        <p className="text-gray-600">
          Ask questions in plain English and get optimized PostgreSQL queries.
        </p>
      </div>

      <div className="grid grid-cols-1 lg:grid-cols-2 gap-8">
        {/* Input Section */}
        <div className="space-y-6">
          <div>
            <label className="block text-sm font-medium text-gray-700 mb-2">
              Your Question
            </label>
            <textarea
              value={state.naturalLanguage}
              onChange={(e) => setState({ ...state, naturalLanguage: e.target.value })}
              placeholder="Ask a question about your data... (e.g., 'Show me users who signed up last month')"
              className="w-full h-32 p-3 border border-gray-300 rounded-md focus:ring-2 focus:ring-blue-500 focus:border-blue-500"
            />
          </div>

          <div className="flex space-x-4">
            <button
              onClick={handleBuildQuery}
              disabled={!state.naturalLanguage.trim()}
              className="flex-1 bg-blue-600 text-white py-2 px-4 rounded-md hover:bg-blue-700 disabled:opacity-50"
            >
              Generate SQL
            </button>
            <button
              onClick={handleExecuteQuery}
              disabled={!state.generatedSQL || state.isExecuting}
              className="flex-1 bg-green-600 text-white py-2 px-4 rounded-md hover:bg-green-700 disabled:opacity-50"
            >
              {state.isExecuting ? 'Executing...' : 'Execute Query'}
            </button>
          </div>

          {state.error && (
            <div className="p-3 bg-red-50 border border-red-200 rounded-md">
              <p className="text-red-600 text-sm">{state.error}</p>
            </div>
          )}
        </div>

        {/* Output Section */}
        <div className="space-y-6">
          {state.generatedSQL && (
            <>
              <div className="bg-white border border-gray-200 rounded-lg p-6">
                <h3 className="text-lg font-semibold text-gray-900 mb-4">
                  Generated SQL
                </h3>
                <div className="bg-gray-50 rounded-md p-4">
                  <pre className="text-sm text-gray-800 overflow-x-auto">
                    <code>{state.generatedSQL}</code>
                  </pre>
                </div>
                {state.explanation && (
                  <p className="mt-3 text-sm text-gray-600">
                    {state.explanation}
                  </p>
                )}
              </div>

              {state.suggestions.length > 0 && (
                <div className="bg-white border border-gray-200 rounded-lg p-6">
                  <h3 className="text-lg font-semibold text-gray-900 mb-4">
                    Optimization Suggestions
                  </h3>
                  <ul className="space-y-2">
                    {state.suggestions.map((suggestion, index) => (
                      <li key={index} className="text-sm text-gray-600 flex items-start">
                        <span className="text-blue-500 mr-2">•</span>
                        {suggestion}
                      </li>
                    ))}
                  </ul>
                </div>
              )}

              <div className="bg-white border border-gray-200 rounded-lg p-6">
                <h3 className="text-lg font-semibold text-gray-900 mb-4">
                  Refine Query
                </h3>
                <textarea
                  value={feedback}
                  onChange={(e) => setFeedback(e.target.value)}
                  placeholder="Provide feedback to improve the query..."
                  className="w-full h-20 p-3 border border-gray-300 rounded-md focus:ring-2 focus:ring-blue-500 focus:border-blue-500"
                />
                <button
                  onClick={handleRefineQuery}
                  disabled={!feedback.trim()}
                  className="mt-3 bg-purple-600 text-white py-2 px-4 rounded-md hover:bg-purple-700 disabled:opacity-50"
                >
                  Refine Query
                </button>
              </div>
            </>
          )}

          {state.results.length > 0 && (
            <div className="bg-white border border-gray-200 rounded-lg p-6">
              <h3 className="text-lg font-semibold text-gray-900 mb-4">
                Query Results ({state.results.length} rows)
              </h3>
              <div className="overflow-x-auto">
                <table className="min-w-full divide-y divide-gray-200">
                  <thead className="bg-gray-50">
                    <tr>
                      {Object.keys(state.results[0] || {}).map((column) => (
                        <th key={column} className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase tracking-wider">
                          {column}
                        </th>
                      ))}
                    </tr>
                  </thead>
                  <tbody className="bg-white divide-y divide-gray-200">
                    {state.results.slice(0, 10).map((row, index) => (
                      <tr key={index}>
                        {Object.values(row).map((value, colIndex) => (
                          <td key={colIndex} className="px-6 py-4 whitespace-nowrap text-sm text-gray-900">
                            {String(value)}
                          </td>
                        ))}
                      </tr>
                    ))}
                  </tbody>
                </table>
                {state.results.length > 10 && (
                  <p className="mt-3 text-sm text-gray-500">
                    Showing first 10 of {state.results.length} results
                  </p>
                )}
              </div>
            </div>
          )}
        </div>
      </div>
    </div>
  )
}

Best Practices

1. Query Validation and Safety

class SafeNL2SQLConverter extends AdvancedNL2SQLConverter {
  private readonly allowedOperations = new Set(['SELECT', 'WITH'])
  private readonly blockedKeywords = new Set(['DROP', 'DELETE', 'UPDATE', 'INSERT', 'TRUNCATE', 'ALTER', 'CREATE'])

  async convertToSQL(naturalLanguage: string): Promise<SQLQuery> {
    const query = await super.convertToSQL(naturalLanguage)
    
    // Validate query safety
    const validation = this.validateQuery(query.query)
    if (!validation.safe) {
      throw new Error(`Query blocked for safety: ${validation.reason}`)
    }

    return query
  }

  private validateQuery(query: string): { safe: boolean; reason?: string } {
    const upperQuery = query.toUpperCase().trim()

    // Check for blocked operations
    for (const blocked of this.blockedKeywords) {
      if (upperQuery.includes(blocked)) {
        return { safe: false, reason: `Operation '${blocked}' is not allowed` }
      }
    }

    // Ensure query starts with allowed operation
    const startsWithAllowed = Array.from(this.allowedOperations).some(op => 
      upperQuery.startsWith(op)
    )
    
    if (!startsWithAllowed) {
      return { safe: false, reason: 'Query must start with SELECT or WITH' }
    }

    // Check for potential SQL injection patterns
    const suspiciousPatterns = [
      /;\s*$/,
      /--\s*$/,
      /\/\*.*\*\//,
      /UNION\s+ALL/i,
      /UNION\s+SELECT/i,
    ]

    for (const pattern of suspiciousPatterns) {
      if (pattern.test(query)) {
        return { safe: false, reason: 'Query contains suspicious patterns' }
      }
    }

    return { safe: true }
  }

  async executeQuery(analysis: QueryAnalysis): Promise<QueryResult> {
    // Add query timeout
    const timeout = setTimeout(() => {
      throw new Error('Query execution timeout')
    }, 30000) // 30 seconds

    try {
      const result = await super.executeQuery(analysis)
      clearTimeout(timeout)
      return result
    } catch (error) {
      clearTimeout(timeout)
      throw error
    }
  }
}

2. Performance Monitoring

class MonitoredNL2SQLConverter extends SafeNL2SQLConverter {
  private queryMetrics: Array<{
    question: string
    query: string
    executionTime: number
    rowCount: number
    timestamp: Date
    success: boolean
  }> = []

  async convertWithOptimization(naturalLanguage: string): Promise<QueryAnalysis> {
    const startTime = Date.now()
    
    try {
      const analysis = await super.convertWithOptimization(naturalLanguage)
      
      this.queryMetrics.push({
        question: naturalLanguage,
        query: analysis.query,
        executionTime: Date.now() - startTime,
        rowCount: analysis.estimatedRows || 0,
        timestamp: new Date(),
        success: true,
      })

      return analysis
    } catch (error) {
      this.queryMetrics.push({
        question: naturalLanguage,
        query: '',
        executionTime: Date.now() - startTime,
        rowCount: 0,
        timestamp: new Date(),
        success: false,
      })
      throw error
    }
  }

  getPerformanceStats(): {
    totalQueries: number
    averageExecutionTime: number
    successRate: number
    mostCommonQuestions: Array<{ question: string; count: number }>
  } {
    const successful = this.queryMetrics.filter(m => m.success)
    const totalTime = successful.reduce((sum, m) => sum + m.executionTime, 0)
    
    const questionCounts = this.queryMetrics.reduce((acc, m) => {
      acc[m.question] = (acc[m.question] || 0) + 1
      return acc
    }, {} as Record<string, number>)

    const mostCommon = Object.entries(questionCounts)
      .sort(([,a], [,b]) => b - a)
      .slice(0, 5)
      .map(([question, count]) => ({ question, count }))

    return {
      totalQueries: this.queryMetrics.length,
      averageExecutionTime: successful.length > 0 ? totalTime / successful.length : 0,
      successRate: this.queryMetrics.length > 0 ? successful.length / this.queryMetrics.length : 0,
      mostCommonQuestions: mostCommon,
    }
  }
}

3. Error Handling and Recovery

class RobustNL2SQLConverter extends MonitoredNL2SQLConverter {
  async convertToSQL(naturalLanguage: string): Promise<SQLQuery> {
    try {
      return await super.convertToSQL(naturalLanguage)
    } catch (error) {
      // Try to provide a helpful fallback
      return this.generateFallbackQuery(naturalLanguage, error)
    }
  }

  private async generateFallbackQuery(naturalLanguage: string, originalError: any): Promise<SQLQuery> {
    const fallbackPrompt = `The original query generation failed with error: ${originalError.message}

Original question: "${naturalLanguage}"

Please generate a simple, safe SELECT query that might help answer this question. 
Focus on basic operations and avoid complex JOINs or subqueries.

Respond with a JSON object:
{
  "query": "Simple SELECT query",
  "explanation": "What this query does",
  "tables": ["table1"],
  "estimatedCost": "low"
}`

    const result = await streamText({
      model: openai('gpt-4'),
      messages: [{ role: 'user', content: fallbackPrompt }],
      temperature: 0.1,
      maxTokens: 500,
    })

    try {
      const jsonMatch = result.text.match(/\{[\s\S]*\}/)
      if (jsonMatch) {
        const parsed = JSON.parse(jsonMatch[0])
        return {
          query: parsed.query,
          explanation: `${parsed.explanation} (Fallback query due to original error)`,
          tables: parsed.tables || [],
          estimatedCost: parsed.estimatedCost || 'low',
        }
      }
    } catch (parseError) {
      // If even the fallback fails, return a basic query
      return {
        query: 'SELECT 1 as result',
        explanation: 'Fallback query - unable to generate specific query',
        tables: [],
        estimatedCost: 'low',
      }
    }

    throw originalError
  }
}

Natural language to PostgreSQL is revolutionizing how we interact with databases. Instead of learning complex SQL syntax, users can ask questions in plain English and get accurate, optimized queries. The key is to build robust systems that understand context, provide helpful feedback, and ensure safety and performance.

With the right implementation, you can make your database accessible to everyone while maintaining the power and flexibility of SQL.