Vinte is an LLM architecture designed from scratch and trained on consumer hardware. The architecture combines a novel single-momentum trust-amplified optimizer (FLO), a parallel-scan recurrent state (GLR), content-addressable gated feedforward routing, and a memory-binding FiLM head. Compute is sub-quadratic in sequence length, the recurrent state is fixed-size with no growing key-value cache, and the entire pipeline runs end-to-end on a single rented consumer GPU over SSH.
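Vinte's internal code is not published yet, so the snippet below is not the actual GLR layer; it is a minimal sketch, assuming a simple gated linear recurrence h_t = a_t * h_{t-1} + b_t, of why a fixed-size recurrent state can be resolved with a parallel scan in O(log T) passes rather than a sequential loop. The update rule, combine operator, and shapes are all illustrative assumptions.

```python
import numpy as np


def gated_linear_scan(a, b):
    """Compute h_t = a_t * h_{t-1} + b_t for all t with an inclusive parallel scan.

    a, b: arrays of shape (T, D). The combine operator
        (a1, b1) then (a2, b2)  ->  (a2 * a1, a2 * b1 + b2)
    is associative, so the recurrence resolves in O(log T) doubling passes
    (each pass fully parallel over positions) and the state stays a fixed
    (D,)-sized vector instead of a growing cache.
    """
    a, b = a.copy(), b.copy()
    T = a.shape[0]
    step = 1
    while step < T:
        # Hillis-Steele doubling: position t absorbs the prefix ending at t - step.
        a_prev = a[:-step]
        b_prev = b[:-step]
        b[step:] = a[step:] * b_prev + b[step:]
        a[step:] = a[step:] * a_prev
        step *= 2
    return b  # b[t] now holds h_t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, D = 128, 16
    a = rng.uniform(0.8, 1.0, size=(T, D))   # per-step decay / gate
    b = rng.normal(size=(T, D))              # per-step input contribution

    h_scan = gated_linear_scan(a, b)

    # Reference: the plain sequential recurrence.
    h, h_ref = np.zeros(D), []
    for t in range(T):
        h = a[t] * h + b[t]
        h_ref.append(h)
    assert np.allclose(h_scan, np.stack(h_ref))
```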
Open weights, open code, and a planned open release with a free public website hosted by the project, so people without large GPU rigs can still use the model.
The active research bet is whether a novel from-scratch architecture can be trained to coherence on consumer-scale data and hardware by a single operator. Most active work on alternative architectures right now happens at well-funded labs with frontier-scale compute, or via attention surgery on already-pretrained transformers. Vinte fills the niche neither of those covers: open, single-operator, auditable by design, with no inheritance from someone else's pretraining run.
Goal 1: Ship the first end-to-end Vinte chat model. Two-phase training (next-token pretrain on plain text, then SFT) is currently in progress on a 100M-parameter model; final evaluation runs after this application is submitted. Deliverables: open weights, sample outputs, and an evaluation report.
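For readers unfamiliar with the two-phase split, here is a minimal sketch of the two objectives, assuming a conventional setup: full-sequence cross-entropy for the pretrain, response-masked cross-entropy for the SFT. The function names and masking convention are illustrative, not Vinte's actual training code.

```python
import torch
import torch.nn.functional as F


def pretrain_loss(logits, tokens):
    """Phase 1: plain next-token prediction over the whole document.

    logits: (B, T, V) model outputs, tokens: (B, T) token ids.
    Position t is trained to predict token t+1, so every position contributes.
    """
    return F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        tokens[:, 1:].reshape(-1),
    )


def sft_loss(logits, tokens, response_mask):
    """Phase 2: supervised fine-tuning on chat transcripts.

    Same next-token objective, but only positions whose target token is part
    of the assistant response (response_mask == 1) count toward the loss, so
    the model is not trained to imitate the user turns or the prompt template.
    """
    per_token = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        tokens[:, 1:].reshape(-1),
        reduction="none",
    ).reshape(tokens.size(0), -1)
    mask = response_mask[:, 1:].float()
    return (per_token * mask).sum() / mask.sum().clamp(min=1)


if __name__ == "__main__":
    B, T, V = 2, 16, 1000
    logits = torch.randn(B, T, V)
    tokens = torch.randint(0, V, (B, T))
    response_mask = torch.zeros(B, T, dtype=torch.long)
    response_mask[:, T // 2 :] = 1  # pretend the second half is the assistant reply
    print(pretrain_loss(logits, tokens).item(), sft_loss(logits, tokens, response_mask).item())
```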
Goal 2: Train a 1B-parameter Vinte model properly. The training infrastructure is already validated at 1B scale, including parity gates, observability logging, and resume-from-crash support. Estimated compute: 50 to 100 dollars of consumer-GPU time for the full two-phase pretrain plus SFT.
Goal 3: Run the parental-evolution validation experiment. Vinte ships with a "mistake bank": an append-only catalogue of every observed failure of every model in the lineage, used as a hard auxiliary-loss constraint on future training runs. The aim is cross-generation alignment via accumulated negative examples. Validation: ingest 30 to 50 of v1's actual failure outputs, train a v2 with the bank as an auxiliary loss, and measure whether v2 specifically avoids v1's failures without regressing on general loss.
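The exact form of the hard auxiliary-loss constraint is not specified here, so the sketch below assumes one plausible formulation: an unlikelihood-style penalty that pushes down the probability v2 assigns to v1's catalogued failure completions. The function, shapes, and the lambda_bank weighting are assumptions, not the actual mistake-bank implementation.

```python
import torch
import torch.nn.functional as F


def mistake_bank_penalty(logits, bank_tokens, bank_mask):
    """Unlikelihood-style penalty on catalogued failure completions.

    logits:      (B, T, V) v2's outputs on the bank's prompt + failure text
    bank_tokens: (B, T)    token ids of v1's recorded failure outputs
    bank_mask:   (B, T)    1 where the token belongs to the failure completion

    Penalises -log(1 - p) for each failure token, averaged over the masked
    positions, so probability mass is pushed off v1's recorded mistakes.
    """
    log_probs = F.log_softmax(logits[:, :-1], dim=-1)
    targets = bank_tokens[:, 1:].unsqueeze(-1)
    p_fail = log_probs.gather(-1, targets).squeeze(-1).exp()
    penalty = -torch.log1p(-p_fail.clamp(max=1.0 - 1e-6))
    mask = bank_mask[:, 1:].float()
    return (penalty * mask).sum() / mask.sum().clamp(min=1)


# Training-step sketch: the bank term is added to the ordinary objective,
# weighted by a coefficient (lambda_bank is a free hyperparameter here):
#   total_loss = main_next_token_loss + lambda_bank * mistake_bank_penalty(...)
```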
Goal 4: Write the architecture paper and post to arXiv. Roughly 10 pages covering FLO, GLR, gated FFN, FiLM, the mistake bank, and the v3 marginal-chasing finding (a documented negative result). The paper is what makes Vinte citable and connects the work to the broader research community.
How: the infrastructure is already built. Funding unblocks compute time to run experiments and writing time to ship the paper.
100 percent of the grant is GPU hours on rented consumer-GPU instances at 0.342 dollars per hour. No salary, no overhead, no equipment.
- 1B two-phase pretrain plus SFT: ~80 dollars
- v2 with mistake-bank validation experiment: ~50 dollars
- 200M and 300M scaling-line data points: ~150 dollars
- Retries, exploration, paper-supporting ablations: ~220 dollars
Total: ~500 dollars at the minimum funding level.
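For concreteness, the same budget expressed in GPU hours at the quoted 0.342 dollars per hour; this is plain arithmetic on the numbers above, nothing project-specific.

```python
# Convert each budget line into rented-GPU hours at the quoted rate.
RATE = 0.342  # dollars per GPU hour (rented consumer instance)

items = {
    "1B two-phase pretrain + SFT": 80,
    "v2 mistake-bank validation": 50,
    "200M / 300M scaling points": 150,
    "retries, exploration, ablations": 220,
}

for name, dollars in items.items():
    print(f"{name}: ${dollars} ~= {dollars / RATE:.0f} GPU hours")
print(f"total: ${sum(items.values())} ~= {sum(items.values()) / RATE:.0f} GPU hours")
# total: $500 ~= 1462 GPU hours
```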
If funded above 500 dollars: longer-context evaluation, parallel-trace inference experiments, and a 200M chat model with the mistake bank applied from the start.
Independent researcher, self-taught, no formal institutional affiliation. Vinte is the result of months of solo work outside of school. No prior grants, no published papers yet.
Most likely failure: the 1B model continues marginal-chasing even with the two-phase pretrain. That would indicate a training-dynamics issue in the architecture that needs more research before scaling beyond 100M. The funding still produces value: the negative result is documentable and publishable, and the mistake-bank infrastructure, the search service, the evaluation tools, and the cognitive-overwhelm research methodology all ship regardless.
Second failure: the 1B model trains successfully but the open-weights release gets minimal community traction. Partly outside my control. Mitigation is the arXiv paper, which gives the work a citable form even without viral attention.
Third failure: I run out of budget before training the 1B model and the project pauses. This is exactly what the funding is for.
I do not foresee safety-relevant failure modes. The model is too small (1B max) and too narrow (SFT chat) to pose meaningful misuse risk.
Zero from external sources. The project is entirely self-funded. No employer or institutional sponsor.