- 10 years of AER regulatory history indexed with full provenance
- 12 regulatory entity types extracted and cross-linked
- ~20% estimated reduction in submission preparation effort
Ausgrid's regulatory affairs team faced a structural memory problem: two complete AER regulatory cycles, 10 years of objections, justifications, accepted and rejected line items, locked in thousands of PDF pages with no systematic way to surface patterns or answer specific questions without days of manual search. Semantica ingested the full public corpus, built a temporal knowledge graph that tracks how positions evolved across cycles, and wired a hybrid retrieval layer that answers cross-cycle questions in under 3 seconds with exact source citations. Teams preparing the 2029–34 submission now start with a complete institutional memory rather than a blank page.
About
Ausgrid is a Distribution Network Service Provider (DNSP) serving greater Sydney and the Hunter Valley, with more than 1.7 million customers across one of Australia's most complex electricity distribution networks. As a regulated monopoly under the National Electricity Law, its allowed revenue is determined every five years by the Australian Energy Regulator (AER) through a formal, multi-stage process. Two complete cycles now exist in the public record: 2019–24 and 2024–29. The 2029–34 cycle is the next horizon. Each determination involves billions of dollars in capital expenditure and operational cost allowances, reviewed by a regulator that publishes its reasoning in forensic detail and references its own prior decisions explicitly.
The Problem
Each AER regulatory cycle produces thousands of pages across a structured document hierarchy: Regulatory Proposals, AER Issues Papers, AER Draft Determinations, Ausgrid Responses, Expert Witness Reports, Alternative Control Services determinations, and Final Determinations. Two complete cycles, 2019–24 and 2024–29, already exist in full. They contain every objection the AER has raised, every cost category Ausgrid has defended, every accepted and rejected justification, and every negotiated outcome. The AER references this history explicitly when assessing new proposals. Ausgrid's submission teams could not.
- 18–24 months per regulatory cycle: the most time-intensive and highest-stakes recurring workflow in the business
- Thousands of pages across two completed cycles: no single analyst has read them in full, and no tool has indexed them with cross-cycle linkage
- 12 distinct regulatory entity types (including Revenue Allowances, Capex Programs, Opex Categories, RAB values, Community Outcomes, AER Objections, AG Responses, and Expert Witness positions) scattered across unlinked PDFs
- Zero systematic institutional memory: when senior regulatory affairs staff turn over between cycles, critical negotiating context walks out the door
- Pattern blindness: identifying which arguments the AER has consistently accepted or rejected requires weeks of manual comparison across thousands of pages
- 2–4 hours of manual document search to answer a single factual question about a prior cycle determination: too slow for a live submission process
“The AER does not forget. It holds institutional memory going back decades. Submission teams who start each 18-month cycle from scratch are preparing arguments against a regulator with perfect recall, without an intelligence layer of their own.”
The Solution
Semantica ingested the complete public corpus of AER and Ausgrid regulatory documents across both cycles, parsing PDFs with structure-aware chunking that preserves the regulatory document hierarchy rather than breaking at arbitrary character limits. The output is a temporal knowledge graph in Neo4j where every extracted entity is linked to its source document, part, section heading, and page number, and where cross-cycle evolution is tracked through typed evolves_to edges. A hybrid retrieval layer combining Pinecone semantic search and Neo4j graph traversal answers plain-English questions with exact source citations in under 3 seconds. A two-stage LLM pipeline using step-back query decomposition followed by grounded synthesis with 7 specialist regulatory agent tools surfaces patterns that would otherwise require days of manual review.
System pipeline
1. Ingest: the full public document corpus across both cycles is pulled: AER Issues Papers, Draft Determinations, Final Determinations, Ausgrid Regulatory Proposals, all Responses, and Expert Witness submissions
2. Parse: StructuralChunker splits each document at section boundaries, preserving the regulatory hierarchy (Regulatory Proposal, Part, Chapter, Section, Paragraph) so no context is broken by chunking
3. Extract: NERExtractor identifies 12 entity types across every document: Revenue Allowances, Capex Programs, Opex Categories, RAB values, Community Outcomes, AER Objections, AG Responses, Expert Witness positions, and more
4. Annotate: TemporalAnnotator adds typed evolves_to edges between matched entities across cycles, making it possible to ask how AER's position on a cost category changed from 2019 to 2024
5. Index: ProvenanceManager links every extracted entity to its source document identifier, part number, section heading, and page so every answer generated by the system is citable by a human reviewer
6. Query: a hybrid retrieval layer (Pinecone semantic search and Neo4j graph traversal) with a two-stage LLM pipeline answers plain-English questions with full source citations in under 3 seconds
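The source does not say how the semantic and graph result sets are merged in the query step. As one hedged illustration, the sketch below uses reciprocal rank fusion, a common way to combine ranked lists. The function name, the constant `k=60`, and the chunk ids are assumptions for illustration, not Semantica's actual method.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids into one list, best first.

    Each id scores 1 / (k + rank) per list it appears in; scores are summed.
    """
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a stable result.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical result ids standing in for the two retrievers.
semantic_hits = ["chunk-7", "chunk-2", "chunk-9"]  # vector search stand-in
graph_hits = ["chunk-2", "chunk-4"]                # graph traversal stand-in

fused = reciprocal_rank_fusion([semantic_hits, graph_hits])
print(fused)
```

An id returned by both retrievers ends up ahead of ids returned by only one, which is the behaviour a hybrid layer usually wants.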
Example query
MATCH (e1:Expense {category: "Vegetation Management", cycle: "2019-24"})
-[:EVOLVES_TO]->(e2:Expense {category: "Vegetation Management", cycle: "2024-29"})
MATCH (e1)-[:AER_OBJECTION]->(o1:Objection)
MATCH (e2)-[:AER_OBJECTION]->(o2:Objection)
RETURN
  e1.allowance AS allowance_2019_24,
  o1.reason AS aer_objection_2019,
  o1.source_page AS objection_page_2019,
  e2.allowance AS allowance_2024_29,
  o2.reason AS aer_objection_2024,
  o2.source_page AS objection_page_2024,
  e2.allowance - e1.allowance AS delta_allowance
// Returns: position history, objection text, source page refs, and allowance delta
// Execution time: < 3 seconds
// Previously: 2–4 hours of manual search across two cycles of documents
Semantica Modules
StructuralChunker
Splits regulatory PDFs at section boundaries rather than character limits, preserving the document hierarchy that gives AER determinations their meaning. A chunk never spans two sections.
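A minimal sketch of section-boundary chunking, assuming section headings can be recognised by a numbered pattern such as "1.1 Capex". The regex, the `Chunk` type, and `chunk_by_section` are hypothetical names for illustration and are not taken from Semantica's API.

```python
import re
from dataclasses import dataclass

# Assumption: a heading is a dotted section number followed by a title.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+(\S.*)$")


@dataclass
class Chunk:
    section: str  # dotted section number, e.g. "1.1"
    heading: str  # heading text
    text: str     # body text belonging to this section only


def chunk_by_section(lines):
    """Split lines into chunks so that no chunk spans two sections.

    Lines that appear before the first heading are dropped in this sketch.
    """
    chunks, current, body = [], None, []
    for line in lines:
        match = HEADING.match(line)
        if match:
            if current is not None:
                chunks.append(Chunk(current[0], current[1], "\n".join(body).strip()))
            current, body = (match.group(1), match.group(2)), []
        elif current is not None:
            body.append(line)
    if current is not None:
        chunks.append(Chunk(current[0], current[1], "\n".join(body).strip()))
    return chunks


# Hypothetical document lines.
doc = [
    "1 Overview",
    "Summary text.",
    "1.1 Capex",
    "Replacement program detail.",
    "More detail.",
]
chunks = chunk_by_section(doc)
for c in chunks:
    print(c.section, c.heading, repr(c.text))
```

A real parser would need a stricter heading test, since a body line that happens to start with a number would be misread as a heading here.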
NERExtractor
Identifies 12 regulatory entity types across every document in the corpus: Revenue Allowances, Capex Programs, Opex Categories, RAB values, Community Outcomes, AER Objections, AG Responses, Expert Witness positions, and more.
TemporalAnnotator
Adds typed evolves_to edges between matched entities across regulatory cycles, enabling cross-cycle traversal queries such as how AER's position on a specific cost category changed between the two determinations.
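The cross-cycle linking step can be sketched as follows. The source does not describe how entities are matched between cycles, so this example assumes the simplest rule: same entity type and same name after normalising case and whitespace. `Entity` and `link_across_cycles` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    entity_type: str  # e.g. "OpexCategory"
    name: str         # e.g. "Vegetation Management"
    cycle: str        # e.g. "2019-24"


def link_across_cycles(entities, earlier, later):
    """Return (source, "evolves_to", target) edges between two cycles.

    Assumed matching rule: identical entity type and normalised name.
    """
    def key(entity):
        return (entity.entity_type, entity.name.strip().lower())

    later_index = {key(e): e for e in entities if e.cycle == later}
    edges = []
    for e in entities:
        if e.cycle == earlier and key(e) in later_index:
            edges.append((e, "evolves_to", later_index[key(e)]))
    return edges


# Illustrative entities; "Category B" is a placeholder name.
entities = [
    Entity("OpexCategory", "Vegetation Management", "2019-24"),
    Entity("OpexCategory", "vegetation management ", "2024-29"),
    Entity("CapexProgram", "Category B", "2019-24"),
]
edges = link_across_cycles(entities, earlier="2019-24", later="2024-29")
for source, label, target in edges:
    print(source.name, label, target.cycle)
```

Exact-name matching will miss categories that were renamed or restructured between determinations, so a production matcher would likely need fuzzier rules or human review.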
ProvenanceManager
Links every extracted entity to its source document, part number, section heading, and page. Every answer is citable. No claim is generated without a verifiable reference.
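One way to picture the "no claim without a verifiable reference" rule is a provenance record attached to every claim, with unsourced claims filtered out before an answer is assembled. The types, field names, and sample values below are hypothetical and only mirror the four provenance fields named above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    document_id: str
    part: str
    section_heading: str
    page: int

    def cite(self) -> str:
        """Format the reference so a human reviewer can look it up."""
        return f'{self.document_id}, Part {self.part}, "{self.section_heading}", p. {self.page}'


@dataclass(frozen=True)
class Claim:
    text: str
    source: Optional[Provenance]


def citable(claims):
    """Keep only claims that carry a source reference."""
    return [c for c in claims if c.source is not None]


# Placeholder data, not taken from any real determination.
source = Provenance("EXAMPLE-DOC-01", "2", "Operating expenditure", 14)
claims = [
    Claim("Placeholder claim with a source.", source),
    Claim("Placeholder claim without a source.", None),
]
kept = citable(claims)
print(kept[0].source.cite())
```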
GraphBuilder
Assembles the temporal knowledge graph in Neo4j with full cross-cycle relationship traversal. Both the 2019–24 and 2024–29 cycles exist as connected layers in the same graph.
PolicyEngine
Evaluates the current regulatory proposal draft against AER compliance rules and prior accepted positions, flagging deviations before they reach the AER's formal review stage.
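The source does not list the compliance rules themselves. As a rough sketch under that caveat, the example below flags two kinds of deviation in a draft: a justification that was rejected in a prior cycle, and a category with no prior position on record. All names and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriorPosition:
    category: str
    justification: str
    accepted: bool


@dataclass(frozen=True)
class DraftItem:
    category: str
    justification: str


def flag_deviations(draft, history):
    """Return (item, reason) pairs for draft items that need review."""
    rejected = {(p.category, p.justification) for p in history if not p.accepted}
    known_categories = {p.category for p in history}
    flags = []
    for item in draft:
        if (item.category, item.justification) in rejected:
            flags.append((item, "justification previously rejected"))
        elif item.category not in known_categories:
            flags.append((item, "no prior position on record"))
    return flags


# Placeholder history and draft; categories and justifications are invented labels.
history = [
    PriorPosition("Category A", "J1", accepted=False),
    PriorPosition("Category A", "J2", accepted=True),
]
draft = [
    DraftItem("Category A", "J1"),
    DraftItem("Category A", "J2"),
    DraftItem("Category C", "J3"),
]
flags = flag_deviations(draft, history)
for item, reason in flags:
    print(item.category, item.justification, "->", reason)
```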
Results
| Task | Before | After Semantica |
|---|---|---|
| Locate AER's position on a specific cost category | 2–4 hours of manual search across both cycle document sets | Under 3 seconds, with exact source citations and page numbers |
| Cross-cycle pattern and delta analysis | Days of manual comparison across thousands of pages by a senior analyst | Single graph traversal query returning allowance history, objection reasons, and delta |
| Onboarding a new regulatory analyst to the full cycle history | 6–12 months to build contextual understanding from scratch | Days: the full institutional memory is queryable and citable from day one |
| Submission team preparation for a new cycle | Rediscovering prior cycle outcomes from scratch each 18-month cycle | Build the 2029–34 proposal on top of indexed, queryable prior history |
| Surfacing recurring AER objection patterns | Not systematically possible; even an approximation required weeks of manual review | Automatic: the temporal graph surfaces pattern recurrence across both cycles |
| Estimated regulatory preparation effort saving | Baseline: 18–24 month cycle with full archaeology of prior documents | ~20% reduction in submission preparation effort across the cycle |
Who It Helps
Regulatory Submission Teams
Build the 2029–34 proposal on an indexed foundation of prior cycle history. Stop rediscovering what AER has accepted and rejected, and start from a complete institutional picture.
Regulatory Strategy Leads
Identify which AER objection patterns recur across cycles and prepare counter-arguments in advance, informed by the full historical record of what the regulator has and has not accepted.
Expert Witness Consultants
Brief in minutes on AER's documented position history before preparing expert reports. No six-week document review just to establish the baseline before expert analysis begins.
New Regulatory Analysts
Access full institutional memory from day one, not dependent on the one person who was in the room during the 2019–24 determination and remembers the negotiating context.
Conclusion
Ausgrid's regulatory submission process is one of the highest-stakes recurring workflows in Australian energy infrastructure. The team preparing the 2029–34 determination will include people who were not in the room when the 2019–24 objections were negotiated. Without a systematic intelligence layer, institutional knowledge depreciates with every staff change and accumulates nowhere. Semantica does not replace the regulatory affairs team. It gives them memory that does not walk out the door: a queryable, provenance-backed institutional record that makes every prior cycle finding accessible, citable, and usable in the preparation of the next. For any regulated DNSP approaching an AER cycle, the question is no longer whether that institutional memory matters. It is whether it will be there when you need it.
Get in touch
Preparing for your next AER regulatory cycle?
We work with regulated utilities, regulatory affairs teams, and expert witness consultants who need to turn prior cycle documents into queryable institutional intelligence. If the 2029–34 cycle is on your horizon, we want to talk.