adxpandas¶

adxpandas is a standalone sub-module that executes Kusto Query Language (KQL) queries directly over one or more pandas DataFrames — no database required.

When to use adxpandas vs adxlite¶

Feature	adxpandas	adxlite
KQL over DataFrames	✅	✅ (via SQLite)
No database dependency	✅	❌ (requires SQLite)
Persistent storage	❌	✅
SQL-accelerated queries	❌	✅
`.append` command	❌	✅
Type-aware storage	❌	✅

Use adxpandas when you want to query DataFrames in-memory without any persistence layer. Use adxlite when you need persistent storage, SQL acceleration, or data ingestion workflows.

Installation¶

pip install adxpandas

Quick Start¶

import pandas as pd
from adxpandas import AdxPandasClient

users = pd.DataFrame({
    "name": ["Ada", "Alan", "Grace"],
    "city": ["London", "London", "Arlington"],
    "score": [10, 20, 30],
})

client = AdxPandasClient({"Users": users})
result = client.query('Users | where city == "London" | project name, score')
print(result)

API Reference¶

`AdxPandasClient`¶

class AdxPandasClient:
    def __init__(self, tables: dict[str, pd.DataFrame] | None = None) -> None: ...
    def add_table(self, name: str, dataframe: pd.DataFrame) -> None: ...
    def remove_table(self, name: str) -> None: ...
    def query(self, kql: str) -> pd.DataFrame: ...
    def list_tables(self) -> list[str]: ...

Supported Operators¶

All operators supported by adxlite are supported in adxpandas:

where — filter rows
project — select/rename columns
project-away — remove columns
extend — add computed columns
summarize — aggregate with grouping
sort by / order by — sort rows
top — top N by column
take / limit — limit row count
distinct — deduplicate
count — row count
parse — extract fields from strings
join — join two tables
union — combine tables
render — chart visualization (timechart, barchart, columnchart, piechart, linechart, areachart)

Supported Functions¶

All scalar and aggregate functions supported by adxlite are available in adxpandas. See Functions Reference for the full list.

`let` Statements¶

Both scalar and tabular let bindings are supported:

let threshold = 100;
let active_users = Users | where last_login > ago(7d);
active_users | where score > threshold | project name, score

Unsupported Features¶

.append command (use adxlite for storage operations)
Persistent storage / ingestion
SQLite-specific features

Wrap: Single-DataFrame Quick Query¶

The Wrap class provides a fluent API for querying a single DataFrame:

from adxpandas import Wrap

w = Wrap(df)

# Execute KQL query (use 'self' as the table name)
result = w.execute('self | where city == "London" | project name, score')
print(result.df)

# Method chaining
w.where('score > 10').sort('score desc').take(5).df

# Chart rendering
w.summarize("count()", by="city").render("barchart")

Wrap API¶

class Wrap:
    def __init__(self, df: pd.DataFrame) -> None: ...

    @property
    def df(self) -> pd.DataFrame: ...

    def execute(self, query: str) -> Wrap | RenderResult: ...

    # Chaining methods (all return Wrap)
    def where(self, condition: str) -> Wrap: ...
    def project(self, *columns: str) -> Wrap: ...
    def project_away(self, *columns: str) -> Wrap: ...
    def extend(self, *expressions: str) -> Wrap: ...
    def summarize(self, aggregations: str, by: str | None = None) -> Wrap: ...
    def sort(self, by: str) -> Wrap: ...
    def take(self, n: int) -> Wrap: ...
    def top(self, n: int, by: str) -> Wrap: ...
    def count(self) -> Wrap: ...
    def distinct(self, *columns: str) -> Wrap: ...

    # Terminal method (returns RenderResult)
    def render(self, visualization: str, **kwargs) -> RenderResult: ...

Jupyter Magic¶

The %kql magic enables interactive KQL queries in Jupyter notebooks:

import adxpandas.magic  # registers %kql and %%kql

# Line magic
%kql df | where score > 10 | take 5

# Cell magic
%%kql
df
| where city == "London"
| summarize avg(score) by city

# Capture result
result = %kql df | where score > 10

How it works:

Scans local and global namespace for DataFrames and Wraps
Variable names become table names in the query
Returns Wrap (for data) or RenderResult (if query ends with render)
Requires IPython: pip install adxpandas[notebook]

Render: Chart Visualization¶

The render operator produces charts from query results:

T | summarize count() by city | render barchart
T | summarize avg(score) by bin(ts, 1h) | render timechart
T | ... | render columnchart with (xcolumn=name, title="Scores")

Supported Visualizations¶

Type	Description
`timechart`	Line chart (time series)
`linechart`	General line chart
`barchart`	Horizontal bar chart
`columnchart`	Vertical bar chart
`piechart`	Pie chart
`areachart`	Filled area chart
`table`	Table figure

RenderResult¶

class RenderResult:
    df: pd.DataFrame         # The underlying data
    render_op: RenderOp      # The render specification
    figure: matplotlib.Figure  # The chart (lazily created)

    def _repr_html_(self) -> str: ...  # Jupyter display
    def show(self) -> None: ...        # plt.show()

Requirements¶

Requires matplotlib: pip install adxpandas[notebook]
AdxPandasClient.query() ignores render (always returns DataFrame)
Wrap and magic handle render → RenderResult conversion

Architecture¶

adxpandas consists of:

Parser — shared KQL parser (tokenizer, AST, recursive descent parser)
Functions — pure-Python implementations of KQL scalar functions
Engine — pandas-native execution of KQL operators
PandasOperatorExecutor — applies individual operators to DataFrames
PandasExecutionEngine — orchestrates full query execution including let bindings, unions, and joins

Acknowledgments¶

Some functionality in adxpandas was inspired by and references KustoPandas. We thank the KustoPandas authors for their pioneering work on KQL-over-pandas execution.