A discrimination tree for the purpose of unifying local expressions with library results.
This data structure is based on Lean.Meta.DiscrTree and Lean.Meta.LazyDiscrTree,
and includes many more features.
New features #
The keys
Key.lam,Key.forallandKey.bvarhave been introduced in order to allow for matching under lambda and forall binders.Key.lamhas arity 1 and indexes the body.Key.forallhas arity 2 and indexes the domain and the body. The reason for not indexing the domain of a lambda expression is that it is usually already determined, for example in∃ a : α, p, which is@Exists α fun a : α => p, we don't want to index the domainαtwice. In a forall expression we should index the domain, because in an implicationp → qwe need to index bothpandq.Key.bvarworks the same asKey.fvar, but stores the De Bruijn index to identify the variable.For example, this allows for more specific matching with the left-hand side of
∑ i ∈ Finset.range n, i = n * (n - 1) / 2, which is indexed by[⟨Finset.sum, 5⟩, ⟨Nat, 0⟩, ⟨Nat, 0⟩, *0, ⟨Finset.Range, 1⟩, *1, λ, ⟨#0, 0⟩].The key
Key.startakes aNatidentifier as an argument. For example, the library pattern?a + ?ais encoded as@HAdd.hAdd *0 *0 *1 *2 *3 *3.*0corresponds to the type of?a,*1to the outParam ofHAdd.hAdd,*2to theHAddinstance, and*3toa. This means that it will only match an expressionx + yifxis indexed the same asy. The matching algorithm ensures that both instances of*3match with the same pattern in the lookup expression.We evaluate the matching score of a unification. This score represents the number of keys that had to be the same for the unification to succeed. For example, matching
(1 + 2) + 3withadd_commgives a score of 2, since the pattern ofadd_commis@HAdd.hAdd *0 *0 *0 *1 *2 *3: matchingHAdd.hAddgives 1 point, and matching*0again after its first appearance gives another point. Similarly, matching it withNat.add_commgives a score of 3, andadd_assocgives a score of 5.Patterns that have the potential to be η-reduced are put into the
RefinedDiscrTreeunder all possible reduced key sequences. This is for terms of the formfun x => f (?m x₁ .. xₙ), where?mis a metavariable, one ofx₁, .., xₙisx, andfis not a metavariable. For example, the patternContinuous fun y => Real.exp (f y)])is indexed by both@Continuous *0 ℝ *1 *2 (λ, Real.exp *3)and@Continuous *0 ℝ *1 *2 Real.exp, so that it also comes up if you look upContinuous Real.exp.How to deal with number literals is waiting for this issue to be resolved: https://github.com/leanprover/lean4/issues/2867
The key
Key.opaqueonly matches with aKey.starkey. Depending on the configuration, β-reduction and ζ-reduction may be disabled, so the resulting applied lambda expressions or let-expressions are indexed byKey.opaque.
Lazy computation #
To encode an Expr as a sequence of Keys, we start with a LazyEntry and
we have a incremental evaluation function of type
LazyEntry → MetaM (Option (List (Key × LazyEntry))), which computes the next keys
and lazy entries, or returns none if the last key has been reached already.
The RefinedDiscrTree then stores these LazyEntries at its leafs, and evaluates them
only if the lookup algorithm reaches this leaf.
Alternative optimizations #
RefinedDiscrTree is a non-persistent lazy data-structure. Therefore, when using it, you should
try to use it linearly (i.e. having reference count 1). This is ideal for library search purposes,
which build the discrimination tree once, and store a reference to the tree.
However, for tactics like simp and fun_prop this is less ideal, because they can't use the
data-structure linearly, since copies of the data structure must regularly be stored in the
environment. For fun_prop this is not a serious problem since it doesn't have that many
different lemmas anyways.
Future work: #
Make a version of RefinedDiscrTree that is optimal for tactics like simp and
fun_prop. This would mean using a persistent data structure, and possibly a non-lazy structure.
Matching vs Unification #
A discrimination tree can be used in two ways: either with (unification) or without (matching)
allowing metavariables in the target expression to be instantiated.
Most applications use matching, and the only common use case where unification is used is
type class search. Since the intended applications of the RefinedDiscrTree currently use
matching, the lookup algorithm is most optimized for matching.
Future work: #
Improve the unification lookup.
Returns candidates from all imported modules that match the expression.
Equations
- One or more equations did not get rendered due to their size.
Instances For
Returns candidates from this module that match the expression.
Equations
- One or more equations did not get rendered due to their size.
Instances For
findMatches combines findImportMatches and findModuleMatches.
extshould be an environment extension with anIO.Reffor caching theRefinedDiscrTree.addEntryis the function for creatingRefinedDiscrTreeentries from constants.tyis the expression type.constantsPerTaskis the number of constants in imported modules to be used for each new task.capacityPerTaskis the initial capacity of theHashMapat the root of theRefinedDiscrTreefor each new task.
Equations
- One or more equations did not get rendered due to their size.