Introduction to finalfusion
Features
finalfusion is a file format for word embeddings and an associated set of libraries and utilities. The file format has the following features:
- Word and subword vocabularies.
- Regular and quantized embedding matrices.
- Memory mapping of embedding matrices.
We also provide finalfrontier to train finalfusion embeddings.
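To illustrate why memory mapping of embedding matrices is useful, here is a minimal sketch in Python using only the standard library. It does not read the finalfusion format: it assumes a hypothetical file holding nothing but little-endian 32-bit floats in row-major order, and the names `DIMS`, `write_matrix`, and `read_row` are invented for this example. The point is that a single embedding row can be looked up without loading the whole matrix into memory.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout, NOT the finalfusion format: a raw row-major
# matrix of little-endian float32 values with DIMS columns per row.
DIMS = 4
ROW_BYTES = DIMS * 4  # 4 bytes per float32


def write_matrix(path, rows):
    """Write rows of DIMS floats to path as raw little-endian float32."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(struct.pack(f"<{DIMS}f", *row))


def read_row(path, index):
    """Return row `index` as a tuple, reading it through a memory map.

    The operating system pages in only the parts of the file that are
    touched, so the full matrix is never copied into process memory.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return struct.unpack_from(f"<{DIMS}f", mm, index * ROW_BYTES)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "matrix.bin")
    write_matrix(path, [[0.0, 0.5, 1.0, 1.5], [2.0, 2.5, 3.0, 3.5]])
    print(read_row(path, 1))
```

A real finalfusion file also carries vocabulary and metadata, so the finalfusion libraries should be used to read actual embeddings; this sketch only shows the access pattern that memory mapping enables.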
Getting embeddings
- We provide a growing set of pretrained embeddings.
- We also provide conversions of the fastText Wikipedia and Common Crawl embeddings.
- You can use finalfrontier to train your own finalfusion embeddings.
Libraries
finalfusion libraries are available for:
Specification
If you are interested in implementing your own library for the finalfusion format, please see version 0 of the finalfusion specification.
Acknowledgements
finalfusion and finalfrontier were developed by Daniël de Kok and Sebastian Pütz. Financial support for research and development of this software was provided by the German Research Foundation (DFG) as part of the Collaborative Research Center “The Construction of Meaning” (SFB 833), project A3.