
AI-Powered Data Cleaning: How to Prompt ChatGPT to Fix Corrupted JSON Exports

Learn how to use AI prompts to clean corrupted JSON data from exports, APIs, and logs. Includes step-by-step guides and copy-paste templates for instant data recovery.

TL;DR: Use AI to automatically fix corrupted JSON by providing the broken data, describing the expected structure, and requesting specific repairs. This guide provides templates for common corruption scenarios.

The Problem: Corrupted JSON Exports

You export data from a legacy system, scrape an API, or extract logs, and the JSON is broken: missing quotes, truncated arrays, encoding issues, mixed formats.

Manual cleanup would take hours. AI can fix it in seconds.

Common JSON Corruption Scenarios

Scenario 1: Missing Quotes

{name: Alice, age: 30}

Scenario 2: Truncated Data

{"users":[{"name":"Alice"},{"name":"Bob"},{"na

Scenario 3: Mixed Encoding (escaped characters; the JSON itself is still valid)

{"text":"Caf\u00e9 \u0026 R\u00e9sum\u00e9"}

Scenario 4: Malformed Arrays

[{"id":1}{"id":2}{"id":3}]

Scenario 5: Extra Commas

{"name":"Alice",,,"age":30,}

The AI Repair Workflow

Step 1: Identify the Corruption Type

Quick diagnosis:

  • Parse error at position X → Syntax issue
  • Unexpected end of input → Truncation
  • Invalid character → Encoding problem
  • Missing comma/bracket → Structure issue
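The exact error text depends on the parser you use. As a minimal sketch, assuming Python's standard json module, the helper below (the name diagnose is hypothetical) reports the parser's message and position so you can decide which repair template applies.

```python
import json


def diagnose(text):
    """Return None if text parses, else the parser's message, position, and nearby text."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        start = max(err.pos - 20, 0)
        return {
            "message": err.msg,
            "line": err.lineno,
            "column": err.colno,
            "context": text[start : err.pos + 20],
        }
    return None


print(diagnose('{name: Alice, age: 30}'))
print(diagnose('{"users":[{"name":"Alice"},{"name":"Bob"},{"na'))
```

The context field is handy to paste into the "Corruption type" line of the prompt.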

Step 2: Prepare the Repair Prompt

Template:

You are a JSON repair specialist. Fix the following corrupted JSON.

Corrupted JSON:
{PASTE_BROKEN_JSON}

Expected structure:
{DESCRIBE_STRUCTURE}

Corruption type: {ISSUE_DESCRIPTION}

Output ONLY the repaired, valid JSON. No explanations.
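If you fill this template in from code rather than by hand, a small helper keeps the wording consistent. This is a sketch only; REPAIR_TEMPLATE and build_repair_prompt are hypothetical names, and the placeholder names differ slightly from the ones above.

```python
REPAIR_TEMPLATE = """You are a JSON repair specialist. Fix the following corrupted JSON.

Corrupted JSON:
{broken_json}

Expected structure:
{structure}

Corruption type: {issue}

Output ONLY the repaired, valid JSON. No explanations."""


def build_repair_prompt(broken_json, structure, issue):
    """Fill the template; str.format leaves braces inside the arguments untouched."""
    return REPAIR_TEMPLATE.format(broken_json=broken_json, structure=structure, issue=issue)


prompt = build_repair_prompt(
    broken_json="{name: Alice, age: 30}",
    structure='Object with "name" (string) and "age" (number)',
    issue="Missing quotes around keys and string values",
)
print(prompt)
```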

Step 3: Validate the Repair

Paste the AI output into our JSON Validator to ensure it's syntactically correct.
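If you validate in code instead, note that models sometimes wrap the reply in a Markdown code fence even when told to output only JSON. The sketch below, assuming Python's json module and a hypothetical helper named parse_model_reply, strips such a fence before parsing.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker


def parse_model_reply(reply):
    """Strip an optional Markdown code fence, then parse; raises json.JSONDecodeError if invalid."""
    text = reply.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        lines = text.splitlines()
        # Drop the opening fence line (which may carry a language tag) and the closing fence line
        text = "\n".join(lines[1:-1])
    return json.loads(text)


print(parse_model_reply(FENCE + 'json\n{"name":"Alice","age":30}\n' + FENCE))
```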

Copy-Paste Repair Templates

Template 1: Missing Quotes Repair

You are a JSON repair specialist.

Corrupted JSON:
{name: Alice, email: alice@example.com, age: 30}

Issue: Missing quotes around keys and string values

Expected structure: Object with string keys "name", "email" (string values) and "age" (number value)

Fix by:
1. Adding double quotes around all keys
2. Adding double quotes around string values
3. Keeping numbers unquoted

Output ONLY valid JSON:

AI Output:

{"name":"Alice","email":"alice@example.com","age":30}

Template 2: Truncated Data Recovery

You are a JSON repair specialist.

Corrupted JSON (truncated):
{"users":[{"id":1,"name":"Alice"},{"id":2,"name":"Bob"},{"id":3,"na

Expected structure: Object with "users" array containing objects with "id" (number) and "name" (string)

The data was truncated mid-record. Complete the JSON by:
1. Closing the truncated "name" field with a reasonable value
2. Closing all open brackets/braces
3. Ensuring valid syntax

Output ONLY valid JSON:

AI Output (the name "Charlie" is the model's guess, not recovered data):

{"users":[{"id":1,"name":"Alice"},{"id":2,"name":"Bob"},{"id":3,"name":"Charlie"}]}
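When an invented value is not acceptable, one alternative is to drop the incomplete record instead of completing it. The sketch below is one way to do that without AI: it cuts back to the last closed object or array and then closes whatever is still open. The function name close_truncated is hypothetical, and the approach assumes no brace or bracket appears inside a string value in the truncated tail.

```python
import json


def close_truncated(text):
    """Cut back to the last closed object or array, then close whatever is still open."""
    cut = max(text.rfind("}"), text.rfind("]"))
    if cut == -1:
        raise ValueError("nothing complete to salvage")
    prefix = text[: cut + 1]

    closers = []  # stack of closing characters still owed
    in_string = escaped = False
    for ch in prefix:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]":
            if not closers:
                raise ValueError("unbalanced closing bracket")
            closers.pop()
    if in_string:
        raise ValueError("cut point falls inside a string")
    return prefix + "".join(reversed(closers))


truncated = '{"users":[{"id":1,"name":"Alice"},{"id":2,"name":"Bob"},{"id":3,"na'
salvaged = close_truncated(truncated)
json.loads(salvaged)  # raises if the result is still invalid
print(salvaged)
```

Here the third record is discarded and the two complete records survive unchanged.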

Template 3: Encoding Fix

You are a JSON repair specialist.

JSON (syntactically valid, but with escaped characters):
{"description":"Caf\u00e9 \u0026 R\u00e9sum\u00e9","tags":["caf\u00e9","r\u00e9sum\u00e9"]}

Issue: Unicode escapes that should be decoded

Convert all Unicode escapes (\uXXXX) to their actual characters.

Output ONLY valid JSON with decoded characters:

AI Output:

{"description":"Café & Résumé","tags":["café","résumé"]}
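Because \uXXXX escapes are legal JSON, this particular case can also be handled without a model. A minimal sketch, assuming Python's standard json module: parse the text, then write it back with ensure_ascii=False.

```python
import json

escaped = r'{"description":"Caf\u00e9 \u0026 R\u00e9sum\u00e9","tags":["caf\u00e9","r\u00e9sum\u00e9"]}'

# json.loads decodes the escapes; ensure_ascii=False writes the characters back unescaped
decoded = json.dumps(json.loads(escaped), ensure_ascii=False, separators=(",", ":"))
print(decoded)
```

The round trip cannot alter the data, which makes it a useful cross-check on the AI output.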

Template 4: Missing Commas/Brackets

You are a JSON repair specialist.

Corrupted JSON:
[{"id":1}{"id":2}{"id":3}]

Issue: Missing commas between array items

Fix by adding commas between objects in the array.

Output ONLY valid JSON:

AI Output:

[{"id":1},{"id":2},{"id":3}]

Template 5: Extra Commas Removal

You are a JSON repair specialist.

Corrupted JSON:
{"name":"Alice",,,"age":30,,"email":"alice@example.com",}

Issue: Extra commas and trailing comma

Fix by:
1. Removing duplicate commas
2. Removing trailing comma

Output ONLY valid JSON:

AI Output:

{"name":"Alice","age":30,"email":"alice@example.com"}

Advanced Repair Scenarios

Scenario 1: Mixed Data Types

Problem:

{"age":"30","active":"true","count":"0"}

Prompt:

Fix this JSON by converting string numbers to actual numbers and string booleans to actual booleans.

Corrupted: {"age":"30","active":"true","count":"0"}

Expected types:
- age: number
- active: boolean
- count: number

Output:

Result:

{"age":30,"active":true,"count":0}
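When the expected types are known up front, the same conversion can be sketched in plain Python, which also makes it easy to fail loudly on values that do not convert. The helper name coerce and the BOOLEANS table are hypothetical.

```python
import json

BOOLEANS = {"true": True, "false": False}


def coerce(record, expected_types):
    """Convert string values to the expected type; raise on anything that does not convert cleanly."""
    result = {}
    for key, value in record.items():
        target = expected_types.get(key)
        if target is bool and isinstance(value, str):
            result[key] = BOOLEANS[value.strip().lower()]  # KeyError on e.g. "yes"
        elif target in (int, float) and isinstance(value, str):
            result[key] = target(value)  # ValueError on e.g. "thirty"
        else:
            result[key] = value
    return result


record = json.loads('{"age":"30","active":"true","count":"0"}')
fixed = coerce(record, {"age": int, "active": bool, "count": int})
print(json.dumps(fixed, separators=(",", ":")))
```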

Scenario 2: Nested Structure Repair

Problem:

{"user":{"name":"Alice""email":"alice@example.com"},"status":"active"}

Prompt:

Fix this JSON with missing comma in nested object.

Corrupted: {"user":{"name":"Alice""email":"alice@example.com"},"status":"active"}

Add missing comma between "name" and "email" fields.

Output:

Result:

{"user":{"name":"Alice","email":"alice@example.com"},"status":"active"}

Scenario 3: Array Reconstruction

Problem:

{"items":"[1,2,3]"}

Prompt:

Fix this JSON where array is incorrectly stored as string.

Corrupted: {"items":"[1,2,3]"}

Convert "items" value from string to actual array.

Output:

Result:

{"items":[1,2,3]}
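This case is double-encoded JSON: the string value is itself JSON text. A minimal sketch of the same fix in Python, assuming the field name "items" from the example:

```python
import json

record = json.loads('{"items":"[1,2,3]"}')

# The value is itself JSON text, so a second json.loads recovers the array
if isinstance(record["items"], str):
    record["items"] = json.loads(record["items"])

print(json.dumps(record, separators=(",", ":")))
```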

Validate Repaired JSON

After AI repairs your JSON, paste it here to verify it's syntactically correct and ready to use.

Validate Repair →

Pro Tips for AI-Powered Repairs

Tip 1: Provide Context

Bad:

Fix this JSON: {broken data}

Good:

This JSON came from a MySQL export. Fix by adding missing quotes and converting string numbers to actual numbers.

Broken: {data}

Tip 2: Specify Expected Structure

Expected structure:
{
  "users": [
    {"id": number, "name": string, "active": boolean}
  ]
}

Tip 3: Handle Large Files in Chunks

For files >1000 lines:

  1. Split into chunks of 100-200 lines, cutting only between records so that no object is split across two chunks
  2. Repair each chunk separately
  3. Combine repaired chunks
  4. Validate final result
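The splitting step can be sketched as follows, assuming the file holds one record per line (as in line-delimited exports and logs). For a single pretty-printed array, cutting by line count can slice an object in half, so this helper would not be enough on its own. The name chunk_lines and the default size of 150 are illustrative.

```python
def chunk_lines(lines, size=150):
    """Yield lists of at most `size` lines; assumes each line holds one complete record."""
    for start in range(0, len(lines), size):
        yield lines[start : start + size]


# Hypothetical input: 1,000 one-line records
lines = ['{"id": %d}' % i for i in range(1000)]
chunks = list(chunk_lines(lines))
print(len(chunks), len(chunks[0]), len(chunks[-1]))
```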

Tip 4: Preserve Original Data

Always keep a backup before AI repair:

cp broken.json broken.json.backup

Real-World Use Cases

Use Case 1: Legacy System Export

Scenario: Export from old CRM system produces malformed JSON

Prompt:

Fix this JSON exported from a legacy CRM system.

Issues:
- Single quotes instead of double quotes
- Dates in non-standard format
- Phone numbers with inconsistent formatting

Corrupted data:
{'customer_name': 'Alice Smith', 'signup_date': '01/15/2024', 'phone': '555.123.4567'}

Convert to:
- Double quotes
- ISO 8601 dates
- Standard phone format (XXX) XXX-XXXX

Output:
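These conversions are mechanical, so the AI output can be cross-checked against a deterministic version. The sketch below assumes the export happens to be a valid Python dict literal, that dates are US-style MM/DD/YYYY, and that phone numbers have 10 digits. None of these assumptions is guaranteed for a real CRM export.

```python
import ast
import json
import re
from datetime import datetime

raw = "{'customer_name': 'Alice Smith', 'signup_date': '01/15/2024', 'phone': '555.123.4567'}"

# Single-quoted keys and values parse as a Python literal; literal_eval does not execute code
record = ast.literal_eval(raw)

# Assumes MM/DD/YYYY; a date like 03/04/2024 is ambiguous unless the source format is known
record["signup_date"] = datetime.strptime(record["signup_date"], "%m/%d/%Y").date().isoformat()

# Keep only digits, then format 10-digit numbers as (XXX) XXX-XXXX
digits = re.sub(r"\D", "", record["phone"])
if len(digits) == 10:
    record["phone"] = f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"

print(json.dumps(record))
```

Stating the source date format in the prompt removes the same ambiguity for the model.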

Use Case 2: API Scraping Cleanup

Scenario: Scraped API responses have HTML entities

Prompt:

Clean this JSON from web scraping.

Corrupted:
{"title":"Alice\u0026#39;s Project","description":"A \u0026quot;great\u0026quot; tool"}

Decode all HTML entities to actual characters.

Output:
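Entity decoding is also available in Python's standard library, which gives a quick way to verify the AI result. A minimal sketch, assuming every value in the object is a string as in the example:

```python
import html
import json

raw = r'{"title":"Alice\u0026#39;s Project","description":"A \u0026quot;great\u0026quot; tool"}'

# json.loads turns \u0026 into "&", leaving entities such as &#39; for html.unescape to decode
record = {key: html.unescape(value) for key, value in json.loads(raw).items()}
print(json.dumps(record, ensure_ascii=False))
```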

Use Case 3: Log File Parsing

Scenario: Application logs with incomplete JSON entries

Prompt:

Repair these incomplete JSON log entries.

Logs:
{"timestamp":"2024-01-15T10:30:00","level":"ERROR","message":"Failed to
{"timestamp":"2024-01-15T10:31:00","level":"INFO","message":"Success"}

Complete truncated entries with reasonable values and ensure all entries are valid JSON.

Output as JSON array:
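Before sending logs to a model, it can help to separate the entries that already parse from the ones that do not, so that intact entries are never rewritten. A sketch under the assumption that each log entry sits on its own line:

```python
import json

log_lines = [
    '{"timestamp":"2024-01-15T10:30:00","level":"ERROR","message":"Failed to',
    '{"timestamp":"2024-01-15T10:31:00","level":"INFO","message":"Success"}',
]

valid, broken = [], []
for number, line in enumerate(log_lines, start=1):
    try:
        valid.append(json.loads(line))
    except json.JSONDecodeError:
        broken.append((number, line))  # keep the line number so repairs can be merged back in order

print(len(valid), "valid;", "broken lines:", [number for number, _ in broken])
```

Only the lines in broken need to go into the prompt.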

Validation Checklist

After AI repair, verify:

  • Syntax: No parse errors
  • Structure: Matches expected schema
  • Data Types: Numbers are numbers, booleans are booleans
  • Completeness: No truncated values
  • Encoding: Special characters display correctly
  • Consistency: Field names match across records

Use our JSON Validator for automated checks.
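A syntax check does not cover the structure, data type, and consistency items. As a rough sketch of how those could be checked in Python, the hypothetical helper check_records below compares each record against a dictionary of expected field types. The sample data is invented for illustration.

```python
import json


def check_records(records, expected_types):
    """Return a list of problems: missing or unexpected fields and wrong value types."""
    problems = []
    for index, record in enumerate(records):
        missing = expected_types.keys() - record.keys()
        extra = record.keys() - expected_types.keys()
        for key in sorted(missing):
            problems.append(f"record {index}: missing field {key!r}")
        for key in sorted(extra):
            problems.append(f"record {index}: unexpected field {key!r}")
        for key, expected in expected_types.items():
            # type() rather than isinstance(), because bool is a subclass of int in Python
            if key in record and type(record[key]) is not expected:
                actual = type(record[key]).__name__
                problems.append(f"record {index}: {key!r} is {actual}, expected {expected.__name__}")
    return problems


repaired = json.loads('{"users":[{"id":1,"name":"Alice","active":true},{"id":"2","name":"Bob"}]}')
problems = check_records(repaired["users"], {"id": int, "name": str, "active": bool})
for problem in problems:
    print(problem)
```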

When AI Repair Fails

Limitation 1: Severe Truncation

AI cannot recover data that is not in the input; it can only guess at it. If >50% of the data is missing, those guesses are not a reliable reconstruction.

Solution: Request original data or use partial recovery.

Limitation 2: Ambiguous Structure

If the corruption makes structure unclear, AI may guess wrong.

Solution: Provide explicit schema in prompt.

Limitation 3: Binary Data

AI can't repair binary corruption in JSON.

Solution: Re-export from source if possible.

Automation Script

For batch repairs, use this workflow:

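import json

from openai import OpenAI

# Requires openai>=1.0 (openai.ChatCompletion was removed in that release).
# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()


def repair_json_with_ai(broken_json, expected_structure):
    """Ask the model to repair broken_json; return the repaired text, or None if it still fails to parse."""
    prompt = (
        "You are a JSON repair specialist. Output ONLY valid JSON.\n\n"
        f"Corrupted JSON:\n{broken_json}\n\n"
        f"Expected structure:\n{expected_structure}\n\n"
        "Fix all syntax errors and output valid JSON:"
    )

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )

    repaired = response.choices[0].message.content

    # Validate: accept the reply only if it parses as JSON
    try:
        json.loads(repaired)
    except (TypeError, json.JSONDecodeError):
        return None
    return repaired


# Usage
broken = '{"name": Alice, "age": 30}'
expected = '{"name": "string", "age": number}'
fixed = repair_json_with_ai(broken, expected)
print(fixed)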

Conclusion

AI-powered JSON repair saves hours of manual work. By providing clear context, expected structure, and specific repair instructions, you can fix most corrupted JSON automatically.

Key Takeaways:

  • Identify corruption type first
  • Provide expected structure
  • Use specific repair templates
  • Always validate output
  • Keep backups of original data

Master Repair Template:

Corrupted JSON: {YOUR_DATA}
Expected structure: {SCHEMA}
Issue: {PROBLEM_DESCRIPTION}
Fix by: {SPECIFIC_INSTRUCTIONS}
Output ONLY valid JSON:

Repair Your JSON Now

Use our Smart Repair tool to automatically fix common JSON syntax errors.

Try Smart Repair →