# Refactor Fe Compiler (Implementation Details)

## File Id

TBW

## Parser/AST

The AST is not managed by `salsa`, since it changes every time a user modifies the source code. The implementation and API should be straightforward because [`Rowan`](https://github.com/rust-analyzer/rowan) will guide us.
This is an example of implementing `rowan::ast::AstNode` for a `Function` node.

```rust=
type SyntaxNode = rowan::SyntaxNode<FeLanguage>;

struct Function {
    syntax: SyntaxNode,
}

impl rowan::ast::AstNode for Function {
    type Language = FeLanguage;

    fn can_cast(kind: SyntaxKind) -> bool {
        // Assumes `SyntaxKind` has a `Function` variant.
        kind == SyntaxKind::Function
    }

    fn cast(syntax: SyntaxNode) -> Option<Self> {
        if Self::can_cast(syntax.kind()) {
            Some(Self { syntax })
        } else {
            None
        }
    }

    fn syntax(&self) -> &SyntaxNode {
        &self.syntax
    }
}

impl Function {
    fn name(&self) -> Option<Ident> {
        // `Ident` impls `AstNode`.
        rowan::ast::support::child(self.syntax())
    }

    fn param_list(&self) -> Option<ParamList> {
        // `ParamList` impls `AstNode`.
        rowan::ast::support::child(self.syntax())
    }

    fn body(&self) -> Option<Block> {
        // `Block` impls `AstNode`.
        rowan::ast::support::child(self.syntax())
    }

    ...
}
```

## HIR Analysis

### HIR Representation

The HIR representation should be independent of source locations, so that queries are not recomputed just because the corresponding locations have changed. To achieve this, we have to define a `BidirectionalMap` that maps `rowan::SyntaxNode` -> `LocErasedNode` and `LocErasedNode` -> `rowan::SyntaxNode`. The map is created in the HIR expansion phase, which we'll see later.
Please refer to [`rust-analyzer/ast_id_map.rs`](https://github.com/rust-lang/rust-analyzer/blob/master/crates/hir-expand/src/ast_id_map.rs) to see how they represent location-erased syntax nodes.

NOTE: Expressions and statements in a body are not managed by salsa, since no other item depends on them.

An example of the HIR node representation is shown below.
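Before the node definitions, here is a minimal sketch of the location-erasure idea described above. It uses only the standard library rather than `rowan`. `SyntaxPtr`, `LocErasedId`, and the methods `alloc`, `id_of`, and `ptr_of` are hypothetical names chosen for illustration, and handing out ids in allocation order is an assumption, not a settled design.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a position-dependent syntax node handle.
/// A real implementation would wrap `rowan::SyntaxNode` or a pointer to it.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct SyntaxPtr {
    pub kind: &'static str,
    pub start: u32,
    pub end: u32,
}

/// Hypothetical location-erased id. It carries no text range, so it stays
/// the same when an edit only shifts the node inside the file.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct LocErasedId(u32);

/// Bidirectional map between location-erased ids and syntax pointers.
#[derive(Debug, Default)]
pub struct AstIdMap {
    to_ptr: Vec<SyntaxPtr>,
    to_id: HashMap<SyntaxPtr, LocErasedId>,
}

impl AstIdMap {
    /// Registers `ptr` and returns its id; ids are handed out in allocation order.
    pub fn alloc(&mut self, ptr: SyntaxPtr) -> LocErasedId {
        if let Some(&id) = self.to_id.get(&ptr) {
            return id;
        }
        let id = LocErasedId(self.to_ptr.len() as u32);
        self.to_ptr.push(ptr.clone());
        self.to_id.insert(ptr, id);
        id
    }

    /// Syntax pointer -> location-erased id.
    pub fn id_of(&self, ptr: &SyntaxPtr) -> Option<LocErasedId> {
        self.to_id.get(ptr).copied()
    }

    /// Location-erased id -> syntax pointer (used when reporting diagnostics).
    pub fn ptr_of(&self, id: LocErasedId) -> Option<&SyntaxPtr> {
        self.to_ptr.get(id.0 as usize)
    }
}
```

In this sketch, if the map is rebuilt after an edit that only shifts an item's text range, the item receives the same id as long as the allocation order is unchanged. Queries keyed on the id therefore see an unchanged input. The HIR node definitions themselves come next.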
```rust=
// ---- HIR definitions ----
#[salsa::tracked]
pub struct Ingot {
    modules: Vec<Module>,
    kind: IngotKind,
}

pub enum IngotKind {
    Internal,
    External,
}

#[salsa::tracked]
pub struct Module {
    attrs: Vec<ModuleAttribute>,
    items: Vec<ModuleItem>,
    ast: LocErasedAst,
    ...
}

#[salsa::tracked]
pub struct ModuleItem {
    #[salsa::id]
    ident: Ident,
    kind: ItemKind,
    ast: LocErasedAst,
}

pub enum ItemKind {
    StructDef(StructDef),
    Function(FunctionSig),
    FunctionWithBody(FunctionWithBody),
    ...
}

#[salsa::tracked]
pub struct StructDef {
    name: Ident,
    fields: Vec<FieldParam>,
    ast: LocErasedAst,
}

pub struct FieldParam {
    name: Ident,
    // `TyPrecursor` represents a type which is not fully resolved.
    ty: TyPrecursor,
    ast: LocErasedAst,
}

pub struct TyPrecursor {
    kind: TyPrecursorKind,
    ast: LocErasedAst,
}

pub enum TyPrecursorKind {
    Path(Path),
    Array(ArrayTyPrecursor),
    Tuple(TupleTyPrecursor),
    ...
}

#[salsa::tracked]
pub struct FunctionSig {
    name: Ident,
    params: Vec<FieldParam>,
    ast: LocErasedAst,
}

#[salsa::tracked]
pub struct FunctionWithBody {
    sig: FunctionSig,
    body: FunctionBody,
    ast: LocErasedAst,
}

#[salsa::tracked]
pub struct FunctionBody {
    stmt_arena: Arena<Statement>,
    expr_arena: Arena<Expr>,
    ast: LocErasedAst,
}

pub struct Ident {
    s: InternedString,
    ast: LocErasedAst,
}

#[salsa::interned]
pub struct InternedString(SmolStr);

// `Idx<T>` is assumed to be the index type of `Arena<T>`.
pub type StmtId = (FunctionSig, Idx<Statement>);
pub type ExprId = (FunctionSig, Idx<Expr>);

pub struct AstIdMap {
    map: BidirectionalMap<LocErasedNode, ast::SyntaxNode>,
}

...

// ---- Query for lowering ----
#[salsa::tracked]
fn lower_module(db: &dyn HirDb, module_: LocErasedModule) -> hir::Module {
    ...
}

#[salsa::tracked]
fn lower_function(db: &dyn HirDb, func: ast::Function) -> Either<FunctionSig, FunctionWithBody> {
    ...
}
```

### Name resolution

Name resolution runs on a whole ingot, since information about all modules is necessary to resolve names.
The invariants of name resolution are:
1. Name resolution should report unresolved name errors.
2. Name resolution should report name collision errors.
3. Name resolution should return a partial map even if resolution fails.

See [rust-analyzer/hir-def/nameres](https://github.com/rust-lang/rust-analyzer/tree/master/crates/hir-def/src/nameres) and [item_scope.rs](https://github.com/rust-lang/rust-analyzer/blob/master/crates/hir-def/src/item_scope.rs) for more detail.

```rust=
// DefMap
#[salsa::tracked]
struct NameResolvedMap {
    modules: Map<hir::Module, ModuleScope>,
    ...
}

#[salsa::tracked]
fn resolve_name(db: &dyn HirDb, ingot: hir::Ingot) -> NameResolvedMap {
    // ..
}

#[salsa::tracked(return_ref)]
fn module_scope(db: &dyn HirDb, module: hir::Module) -> &ModuleScope {
    // Looks up `module` in the result of `resolve_name` for the ingot that contains it.
    ...
}

#[salsa::tracked(return_ref)]
fn body_scope(db: &dyn HirDb, func: FunctionWithBody) -> &FunctionScope {
    ...
}
```

### Function Body Analysis

#### Type Checking/Type Inference

All type errors should be reported in the query. I'm inclined to adopt [Generalizing Hindley-Milner Type Inference Algorithms](http://www.cs.uu.nl/research/techreps/repo/CS-2002/2002-031.pdf) to emit precise errors.

```rust=
#[salsa::tracked(return_ref)]
pub fn infer_body_type(db: &dyn HirDb, func: FunctionWithBody) -> &TypeMap {
    ...
}

pub struct TypeMap(Map<ExprId, Ty>);
```

#### Trait Solving

@brock or chalk?

#### Match Analysis

The current implementation can be used with small modifications.

## HIR-expansion

Although we don't have a procedural macro implementation, we already have some attributes in the compiler, and we are going to add attributes that work as if they were built-in proc-macros. See the [RFC](https://notes.ethereum.org/rMw4Yj9mQumdJlAiTcMEIg).
The important design decision here is that proc-macro-like attributes shouldn't affect the HIR analysis phase. So we have to implement pseudo-expansion components which expand these macros.
TBW.

## MIR

### MIR Representation

I think we can use the current MIR implementation with the fixes below.
* Use `hir::Ty` and remove `mir::Ty`.
* TBW

### MIR-analysis

#### Uninitialized variable analysis

It's quite simple.

#### Lifetime checker

Use [Polonius](https://github.com/rust-lang/polonius) if necessary.

### MIR-optimization

TBW.

## Codegen

### CodeGenUnit (CGU)

A `CGU` is the minimum unit of code generation and corresponds to a `sonatina::Module`. Each `CGU` doesn't depend on other CGUs so sonatina can allow

#### Monomorphization

## Driver

### Input

## Standard library

### Context

Context works as a "feature gate" for blockchain-specific features.
TBW.

### Math

TBW. I hope this will be easy to implement once generics/traits/projections are properly defined.

### Encode

TBW. I hope this will be easy to implement once generics/traits/projections are properly defined.

### Intrinsics

## Testing

TBW.