A Pythonish-SQL-like query tool for JSON files, built as a Python wrapper around the super awesome but difficult to use jq utility.
- Overview
- Quick Start
- Features
- jonq vs Native
- Installation
- Usage
- Example Simple JSON
- Example with Nested JSON
- Advanced Filtering
- Grouping and Aggregation
- Troubleshooting
- Known Limitations
- Contributing
- License
jonq allows you to query JSON data using familiar Pythonish-SQL-like syntax. It translates these queries into jq filters, making it easier for users familiar with SQL and Python to extract and manipulate data from JSON files without learning the full jq syntax (it's just too complex imo).
- SQL-like Syntax: Query JSON with familiar
SELECTstatements and traverse nested JSONs with pythonic like syntax akaabc.def - Field Selection: Choose specific fields from your JSON data
- Filtering: Filter your results using the
ifkeyword - Sorting: Order results with ascending or descending sort
- Pagination: Limit the number of results returned
- Aggregation Functions: Use functions like
sum(),avg(),count(),max()andmin()
While jq is incredibly powerful, its syntax is a pain to use. I can't be the only one who feels that way right? jonq simplifies JSON querying with familiar, intuitive syntax:
| Task | Native jq | jonq |
|---|---|---|
| Select specific fields | jq '.[] | {name: .name, age: .age}' |
jonq data.json "select name, age" |
| Filter with condition | jq '.[] | select(.age > 30) | {name, age}' |
jonq data.json "select name, age if age > 30" |
| Sort results | jq 'sort_by(.age) | reverse | .[0:2]' |
jonq data.json "select name, age sort age desc 2" |
| Work with nested data | jq '.[] | select(.profile.address.city == "New York") | {name, city: .profile.address.city}' |
jonq data.json "select name, profile.address.city if profile.address.city = 'New York'" |
| Count items | jq 'map(select(.age > 25)) | length' |
jonq data.json "select count(*) as count_over_25 if age > 25" |
| Group & aggregate | jq 'group_by(.city) | map({city: .[0].city, count: length})' |
jonq data.json "select city, count(*) as user_count group by city" |
| Complex filters | jq '.[] | select(.age > 25 and (.city == "New York" or .city == "Chicago"))' |
jonq data.json "select * if age > 25 and (city = 'New York' or city = 'Chicago')" |
As you can see, jonq offers:
- Simpler syntax: I'm not sure how much simpler can it get
- Familiar patterns: Py + SQL-like keywords
- Readability: For human readability
- Faster development: Write complex queries in a fraction of the time
- Python 3.9+
- jq command line tool installed (https://stedolan.github.io/jq/download/)
- Clone this repository:
pip install jonq
Make sure you have jq installed:
jq --version
echo '[{"name":"Alice","age":30},{"name":"Bob","age":25}]' > data.json
jonq data.json "select name, age if age > 25"
The query syntax follows a simplified format:
select <fields> [if <condition>] [sort <field> [asc|desc] [limit]]where:
<fields>- Comma-separated list of fields to select or aggregationsif <condition>- Optional filtering conditiongroup by <fields>- Optional grouping by one or more fieldssort <field>- Optional field to sort byasc|desc- Optional sort direction (default: asc)limit- Optional integer to limit the number of results
You can also refer to the json_test_files for the test jsons and look up USAGE.md guide. Anyway let's start with simple.json.
Image a json like the following:
[
{
"id": 1,
"name": "Alice",
"age": 30,
"city": "New York"
},
{
"id": 2,
"name": "Bob",
"age": 25,
"city": "Los Angeles"
},
{
"id": 3,
"name": "Charlie",
"age": 35,
"city": "Chicago"
}
]jonq path/to/simple.json "select *"jonq path/to/simple.json "select name, age"jonq path/to/simple.json "select name, age if age > 30"jonq path/to/simple.json "select name, age if age > 30"jonq path/to/simple.json "select name, age sort age desc 2"jonq path/to/simple.json "select sum(age) as total_age"
jonq path/to/simple.json "select avg(age) as average_age"
jonq path/to/simple.json "select count(age) as count"Simple enough i hope? Now let's move on to nested jsons
Imagine a nested json like below:
[
{
"id": 1,
"name": "Alice",
"profile": {
"age": 30,
"address": {
"city": "New York",
"zip": "10001"
}
},
"orders": [
{
"order_id": 101,
"item": "Laptop",
"price": 1200
},
{
"order_id": 102,
"item": "Phone",
"price": 800
}
]
},
{
"id": 2,
"name": "Bob",
"profile": {
"age": 25,
"address": {
"city": "Los Angeles",
"zip": "90001"
}
},
"orders": [
{
"order_id": 103,
"item": "Tablet",
"price": 500
}
]
}
]jonq path/to/nested.json "select name, profile.age"
jonq path/to/nested.json "select name, profile.address.city"jonq path/to/nested.json "select name, count(orders) as order_count"jonq supports complex boolean conditions using AND, OR, and parentheses:
jonq nested.json "select name, profile.age if profile.address.city = 'New York' or orders[0].price > 1000"
### Find users who are both under 30 AND from Los Angeles
jonq nested.json "select name, profile.age if profile.age < 30 and profile.address.city = 'Los Angeles'"
### Using parentheses for complex logic
jonq nested.json "select name, profile.age if (profile.age > 25 and profile.address.city = 'New York') or (profile.age < 26 and profile.address.city = 'Los Angeles')"jonq supports grouping data and performing aggregations per group:
# Group by city and count users in each city
jonq nested.json "select profile.address.city, count(*) as user_count group by profile.address.city"
# Group by city and get average age in each city
jonq nested.json "select profile.address.city, avg(profile.age) as avg_age group by profile.address.city"
# Group by city and get total orders and average order price
jonq nested.json "select profile.address.city, count(orders) as order_count, avg(orders.price) as avg_price group by profile.address.city"- Make sure jq is installed on your system
- Verify jq is in your PATH by running
jq --version - Install jq: https://stedolan.github.io/jq/download/
- Check your JSON file for syntax errors
- Verify the file exists and is readable
- Use a JSON validator to check your file structure
- Verify your query follows the correct syntax format
- Ensure field names match exactly what's in your JSON
- Check for missing quotes around string values in conditions
- Verify your condition isn't filtering out all records
- Check if your field names match the casing in the JSON
- For nested fields, ensure the dot notation path is correct
- Write Operations: jonq doesn't support writing results back to files. It's a read-only tool (or at least for now).
- Performance: For very large JSON files (100MB+), processing may be slow.
- Advanced jq Features: Some advanced jq features aren't exposed in the jonq syntax.
- CSV Output: Currently only outputs JSON format. CSV export is planned for future versions.
- Multiple File Joins: No support for joining data from multiple JSON files.
- Custom Functions: User-defined functions aren't supported in the current version.
- Date/Time Operations: Limited support for date/time parsing or manipulation.
Docs here: https://jonq.readthedocs.io/en/latest/
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- jq: This tool depends on the jq command-line JSON processor, which is licensed under the MIT License. jq is copyright (C) 2012 Stephen Dolan.
The jq tool itself is not included in this package - users need to install it separately.
