The Conflict Intelligence Platform (CIP) is an open-source system that transforms multilingual conflict reporting (Hebrew, Arabic, Persian, English) into structured, source-traceable intelligence for NGO analysts at organisations like the International Crisis Group or Amnesty International. The v0 scope is the Iran conflict context.
CIP is built around a hard separation between an Admin world (the technical operator who acquires, ingests, and calibrates data) and an Analyst world (the domain expert who uses a pre-calibrated dashboard). The two worlds communicate only through three versioned contracts: a Data Format contract, a Graph Schema contract, and a Configuration contract. This separation makes the analyst's experience clean and the analytical outputs reproducible.
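One way to picture the contract boundary is a version check at the handoff between the two worlds. The sketch below is illustrative only: the contract keys, the semantic-versioning rule, and the function names are assumptions made for this example, not taken from the CIP repo.

```python
# Hypothetical identifiers for the three contracts named above.
CONTRACTS = ("data_format", "graph_schema", "configuration")


def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible(produced: str, expected: str) -> bool:
    """Assumed rule: same major version, and the producer is at least as new."""
    p, e = parse_version(produced), parse_version(expected)
    return p[0] == e[0] and p[1:] >= e[1:]


def check_handoff(admin_versions: dict[str, str],
                  analyst_versions: dict[str, str]) -> list[str]:
    """Return the names of contracts whose versions do not line up."""
    problems = []
    for name in CONTRACTS:
        produced = admin_versions.get(name)
        expected = analyst_versions.get(name)
        if produced is None or expected is None:
            problems.append(name)
        elif not is_compatible(produced, expected):
            problems.append(name)
    return problems


# Made-up version numbers: the Admin side has moved the configuration
# contract to a new major version that the Analyst side does not expect.
admin = {"data_format": "1.2.0", "graph_schema": "1.0.0", "configuration": "2.0.0"}
analyst = {"data_format": "1.1.0", "graph_schema": "1.0.0", "configuration": "1.4.0"}
print(check_handoff(admin, analyst))
```

Under these assumptions the dashboard would refuse to load data when the returned list is non-empty, which keeps the Analyst world from silently reading data it was not calibrated for.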
The grant requested here funds a small pilot build of the acquisition pipeline (SP2) and a prototype of the extraction-and-ingestion pipeline (SP3) on a narrow slice of Iran-context data, sufficient to demonstrate end-to-end provenance from a Persian-language source to a calibrated, exportable analyst finding.
The strategic goal is to give NGO conflict analysts an analytical surface they can actually trust — where every finding traces back to a named source, every analytical method is calibrated transparently for the conflict context, and nothing is auto-published.
Existing tools either treat conflict data as a generic BI problem (no provenance, no source-reliability scoring, no actor-relationship modelling) or are closed commercial platforms inaccessible to small NGOs.
The concrete v0 pilot goal is to ship a working end-to-end demonstration on Iran-context data: a Persian, Arabic, Hebrew, or English source flows through acquisition, translation, entity extraction, curation, ingestion into SQLite/Neo4j, and surfaces in the analyst dashboard as one of six named analytical capabilities (escalation detection, actor network clustering, narrative shift detection, geospatial hotspots, actor centrality, or source reliability audit). Every step is reproducible because parameters and calibration configs are versioned.
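As a rough illustration of what "versioned parameters" can mean in practice, a finding can carry a fingerprint of the exact calibration config that produced it. This is a minimal sketch, not the project's actual mechanism; the field names (`version`, `window_days`, `capability`) are placeholders invented for the example.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Hash a canonical JSON form of the config so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def stamp_finding(finding: dict, config: dict) -> dict:
    """Return a copy of the finding that records which config produced it."""
    return {
        **finding,
        "config_version": config["version"],
        "config_sha256": config_fingerprint(config),
    }


# Placeholder calibration config and finding, for illustration only.
calibration = {"version": "0.1.0", "window_days": 7}
finding = stamp_finding({"capability": "escalation_detection"}, calibration)
print(finding["config_version"], finding["config_sha256"][:12])
```

Rerunning an analysis with the same stamped config should reproduce the same output; any change to a parameter changes the fingerprint and is therefore visible in the exported finding.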
How I'll get there in the pilot: The pilot builds on 18 months of accumulated planning work. The repo already contains three formalised contracts, a canonical schema artifact (CIPSchemaArtifact-v1.0.0.yaml: 7,526 lines, 24 node types, 147 edge types), 220 logged architectural decisions, full operational design for SP3 extraction/curation/ingestion across six specifications, and implementation-readiness specs for both SP2 and SP3.
The pilot will deliver, in order:
1. The SP2 acquisition pipeline against 2-3 direct source feeds (e.g. ACLED, one Persian-language outlet, one Hebrew-language outlet)
2. A working SQLite-as-system-of-record / Neo4j-projection storage layer per the dual-layer architecture
3. Extraction and curation against the canonical schema for a few hundred events
4. One calibration notebook for one of the six analytical capabilities
5. A minimal Phase 4 dashboard slice showing the result with full source provenance
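To make the dual-layer idea in step 2 concrete, here is a minimal sketch in which SQLite holds the authoritative rows and a projection function derives graph edges that each carry their source. The table layout, column names, and sample rows are assumptions for illustration; the real DDL lives in the SP3 storage specification, and the edges here are plain dictionaries standing in for what would be written to Neo4j.

```python
import sqlite3


def build_record_store() -> sqlite3.Connection:
    """Create an in-memory system of record with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE source (
            source_id TEXT PRIMARY KEY,
            name      TEXT NOT NULL,
            language  TEXT NOT NULL
        );
        CREATE TABLE event (
            event_id   TEXT PRIMARY KEY,
            source_id  TEXT NOT NULL REFERENCES source(source_id),
            actor      TEXT NOT NULL,
            target     TEXT NOT NULL,
            event_type TEXT NOT NULL
        );
    """)
    # Made-up sample data; not real sources or events.
    conn.execute("INSERT INTO source VALUES ('s1', 'Example Outlet', 'fa')")
    conn.executemany(
        "INSERT INTO event VALUES (?, ?, ?, ?, ?)",
        [
            ("e1", "s1", "Actor A", "Actor B", "STATEMENT_ABOUT"),
            ("e2", "s1", "Actor B", "Actor C", "MEETING_WITH"),
        ],
    )
    conn.commit()
    return conn


def project_edges(conn: sqlite3.Connection) -> list[dict]:
    """Derive graph edges from the system of record, keeping provenance."""
    rows = conn.execute("""
        SELECT e.event_id, e.actor, e.event_type, e.target, s.name, s.language
        FROM event AS e
        JOIN source AS s ON s.source_id = e.source_id
        ORDER BY e.event_id
    """).fetchall()
    return [
        {
            "from": actor,
            "type": event_type,
            "to": target,
            "event_id": event_id,
            "source": source_name,
            "source_language": language,
        }
        for event_id, actor, event_type, target, source_name, language in rows
    ]


edges = project_edges(build_record_store())
for edge in edges:
    print(edge)
```

Because the graph is always rebuilt from SQLite in this sketch, the projection can be dropped and regenerated without losing data, and every edge shown in a dashboard can be traced back to a named source row.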
The $5,000 minimum funds a 3-month pilot. Approximate allocation:
LLM API costs (translation & extraction): $1,500. Routed through OpenRouter so providers can be swapped without rewriting calling code. Translation across the four source languages (HE/AR/FA/EN) dominates spend.
Solo time buy-down: $2,500 — 50 hours at $50/hr covers the implementation work that planning artifacts already specify.
Compute & hosting: $300 — Small self-hosted VPS for SQLite/Neo4j, cron-scheduled acquisition over 3 months. v0 is single-operator self-hosted by design.
Data source access & licensing: $500 — ACLED is free for academic/non-commercial use; some Tier-2 sources may have per-query or subscription fees.
Tooling & domain: $200 — Domain registration, error monitoring, miscellaneous SaaS.
Total minimum: $5,000
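The provider-swapping point in the LLM line item can be sketched as a thin client that depends only on a small backend interface. Everything here is hypothetical: the class names, the `complete` method, and the model strings are invented for this example, and the `OpenRouterClient` pinned in the implementation specs may look quite different. An offline stand-in backend is used so the sketch runs without network access or credentials.

```python
from typing import Protocol


class LLMBackend(Protocol):
    """Anything that can turn a (model, prompt) pair into text."""

    def complete(self, model: str, prompt: str) -> str: ...


class EchoBackend:
    """Offline stand-in for a real routed backend; it just echoes its input."""

    def complete(self, model: str, prompt: str) -> str:
        return f"[{model}] {prompt}"


class TranslationClient:
    """Hypothetical wrapper: calling code never names a provider directly."""

    def __init__(self, backend: LLMBackend, model: str) -> None:
        self.backend = backend
        self.model = model

    def translate(self, text: str, source_lang: str, target_lang: str = "en") -> str:
        prompt = f"Translate from {source_lang} to {target_lang}: {text}"
        return self.backend.complete(self.model, prompt)


# Swapping providers is a one-string change; the calling code stays the same.
client_a = TranslationClient(EchoBackend(), model="provider-a/model-x")
client_b = TranslationClient(EchoBackend(), model="provider-b/model-y")
print(client_a.translate("salam", source_lang="fa"))
print(client_b.translate("salam", source_lang="fa"))
```

Keeping the model identifier in configuration rather than in code also makes it easier to move translation to a cheaper model if spend runs ahead of the $1,500 estimate.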
If the funding goal of $15,000 is reached, the additional $10,000 extends the pilot to 6 months with the following added scope:
- All five SP2 acquisition cadence groups operational (rather than 2-3)
- Calibration notebooks for at least three of the six analytical capabilities
- A polished analyst-facing dashboard slice usable by at least one NGO partner
- Documentation sufficient for a second technical operator to reproduce the build
I am building this solo. I'm Muhanad Abulhusn, a social data analyst based in the Netherlands working at the intersection of AI integration, data science, and sociopolitical research.
Track record on this specific project: Eighteen-plus months of independent design work has produced the following planning artifacts:
- Three formalised contracts with versioned schemas (Data Format Contract, Graph Schema Contract, Configuration Schema Contract)
- A canonical schema artifact CIPSchemaArtifact-v1.0.0.yaml covering 24 node types and 147 edge types with full property and embedding configuration
- 220 logged architectural decisions with rationale tracking, including a deferred-implementation-guidance section preventing scope creep
- Full operational specifications for Subproject 3 across six documents (extraction pipeline architecture, entity resolution and assertion-history model, storage layer DDL, curation operations manual, quality-and-error-handling protocol, testing and validation plan)
- Implementation-readiness sprints completed for both SP2 acquisition (3 sessions, DEC-207–215) and SP3 extraction/curation/ingestion (4 sessions, DEC-197–206), producing implementation-ready specs that pin Python version, package manager (uv), repo layout, LLM abstraction (OpenRouterClient), logging conventions, and credential management
The honest framing: I am a strong planner and a competent implementer. I do not yet have a shipped public project at this scale. The pilot is partly an evaluation of whether the planning artifacts I produced are actually implementable on the timeline they imply.
Most likely failure mode 1: scope-for-solo mismatch. Even at the trimmed pilot scope, the v0 design spans data acquisition, multilingual NLP, dual-store ingestion, calibration notebooks, and a dashboard slice. A solo operator on a 3-month timeline may ship 60–70% of the pilot rather than 100%. Outcome: the funds still produce a publicly documented partial build with reusable components (the SP2 acquisition pipeline, the SQLite storage layer, the schema artifact) that another team could continue. The planning artifacts remain valuable regardless.
Most likely failure mode 2: calibration produces uninteresting results on a small dataset. Several of the six analytical capabilities (e.g. clustering, hotspot detection, centrality ranking) need enough data density to produce meaningful outputs. A 3-month pilot against 2-3 sources may yield clusters that are degenerate or hotspots that are statistical noise. Outcome: the pilot publishes negative findings about minimum-data-density thresholds for conflict-analysis methods, which is itself useful methodological work.
Most likely failure mode 3: NGO adoption is uncertain. Even with a working dashboard, getting an organisation like ICG or Amnesty to actually pilot CIP requires relationships I do not yet have. Outcome: the pilot lands as open-source infrastructure that smaller research outfits, journalists, or academic teams can adopt without a formal partnership.
Lower-probability but worth naming:
- LLM API cost overruns on translation if Persian/Arabic volume is higher than estimated
- Legal/ToS questions around scraping certain news sources
- The dual-layer SQLite/Neo4j architecture may surface unforeseen consistency issues at scale
What I am explicitly not worried about: scope-rot during the build. The planning artifacts have already absorbed the scope-creep cycles — the decision log shows where adjacent ideas (MCP-based acquisition, AI research services, X API access) were deferred from v0 (DEC-221).
$0. No prior cash funding for CIP. All planning work to date has been self-funded and done on personal time.