To address the black-box nature of existing large models, lower training costs, reduce hallucinations during inference, and break the scaling law, we designed and developed EPMORE (Explainable Process Mixture-of-Experts), an explainable large language model.
EPMORE consists of four major components. SPE (Simple Position Encoding) disentangles position from content. EPL (Expansive Processing Logic) increases feature capacity by dynamically expanding from extremely low dimensions, which makes feature disentanglement possible. PIL (Parameter Independence Loss) decouples the entangled factors in the weight matrices, thereby achieving feature disentanglement. MOR (Middle Output Reuse) allows tokens whose intrinsic dimensionality increases during inference to align with tokenizers at different levels of abstraction.
This project's goals:
1. Design and develop explainable large language models;
2. Explore explainability mechanisms;
3. Enhance the reliability of large language models;
4. Reduce training costs.
Current Status and Plan:
1. Our team has already developed the first version of the product.
2. We are currently training the model using the company's 5 GPUs.
3. Our project will reduce the cost of training LLMs.
4. We are raising funds to rent more GPUs for large-scale training.
5. We will offer our LLM as an online product.
6. Our project also belongs to the field of AI safety and interpretability.
Funding Breakdown:
1. EPMORE model training (GPU rental) - $30,000 - $60,000
2. Salaries for 5 team members for 3 months - $9,000 - $18,000
3. Testing and QA tools/licenses - $7,000 - $14,000
4. Project management, operation, and misc costs - $4,000 - $8,000
Total: from $50,000 to $100,000
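As a quick check of the arithmetic, the line items above can be summed to confirm that they match the stated total range. This is only a minimal sketch: the names `line_items`, `totals`, and `salary_per_person_month` are illustrative, and the even split of salaries across people and months is our assumption, not a stated figure.

```python
# Line items copied from the funding breakdown, as (minimum, maximum) in USD.
line_items = {
    "EPMORE model training, renting GPUs": (30_000, 60_000),
    "Salaries for 5 team members for 3 months": (9_000, 18_000),
    "Testing and QA tools/licenses": (7_000, 14_000),
    "Project management, operation, and misc costs": (4_000, 8_000),
}


def totals(items):
    """Return the (minimum, maximum) totals across all line items."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high


def salary_per_person_month(amount, people=5, months=3):
    """Implied salary per person per month, assuming an even split (an assumption)."""
    return amount / (people * months)


low_total, high_total = totals(line_items)
print(f"Total: ${low_total:,} to ${high_total:,}")
```

Under the even-split assumption, `salary_per_person_month(9_000)` and `salary_per_person_month(18_000)` give the implied monthly salary per team member at the two ends of the range.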
Our team members:
Wei Sheng, a Principal Technical Director with over 20 years of experience in computer network program development and more than 5 years of experience with large language models, PyTorch, and Transformers, responsible for ensuring the quality of LLM training;
Lunhao Ao, a Ph.D. graduate of the University of Missouri-Columbia (USA) with 11 years of academic training in mathematics, and a senior algorithm engineer with over 5 years of Java and Python development experience in the Internet industry, responsible for optimizing model performance and exploring model interpretability;
Qihua Zhang, a principal testing and data processing expert with 15 years of experience;
Junyan Long, a master's graduate from City University of Hong Kong with an outstanding academic record in large AI models, and 1 year of work experience;
Yu Ye, a bachelor’s graduate in Communication Engineering with 2 years of experience in data processing and cleaning.
All five team members are from Mandelbrot Enterprise Management Consulting Enterprise (Limited Partnership).
Without Manifund, we may continue limited work on the project. Manifund support would allow us to scale up our models, improve the explainability mechanisms, accelerate training, and significantly increase the impact of the project.
In the last 12 months, our team has received an angel investment of $294,334 from Shenzhen Ruida Technology Co., Ltd.