A library for Distributional Formal Semantics (DFS).
The following predicates are re-exported from other modules
inf(P,Q) = (Pr(P|Q) - Pr(P)) / (1 - Pr(P))   if Pr(P|Q) > Pr(P)
inf(P,Q) = (Pr(P|Q) - Pr(P)) / Pr(P)         otherwise
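The piecewise definition of inf(P,Q) can be illustrated with a minimal Python sketch. The function name `inference_score` and its arguments are hypothetical and not part of the library; the sketch takes the two probabilities as plain numbers and assumes Pr(P) > 0 in the second case.

```python
def inference_score(pr_p_given_q: float, pr_p: float) -> float:
    """inf(P,Q), computed from Pr(P|Q) and the prior Pr(P).

    Hypothetical helper for illustration; not the library's API.
    """
    if pr_p_given_q > pr_p:
        # Q raises the probability of P: scale by the room left above the prior.
        return (pr_p_given_q - pr_p) / (1.0 - pr_p)
    # Q lowers (or leaves unchanged) the probability of P: scale by the prior.
    # Assumes pr_p > 0; the score is undefined otherwise.
    return (pr_p_given_q - pr_p) / pr_p
```

Under this definition the score lies in [-1, 1]: it is 1 when Q makes P certain, -1 when Q rules P out, and 0 when Q leaves Pr(P) unchanged.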
Pr(P&Q) = sum_i(v_i(P&Q)) / |M|
H(w_i) = - sum_{s in S} Pr(s|v(w_1...i)) * log Pr(s|v(w_1...i))
where v(w_1...i) is the disjunction of all semantics consistent with the prefix w_1...w_i.
S(w_i+1) = -log(P(w_i+1|w_1...i))
         = log(P(w_1...i)) - log(P(w_1...i+1))
         = log(freq(w_1...i)) - log(freq(w_1...i+1))
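The frequency form of this surprisal can be sketched in Python. The sketch assumes that freq(w_1...i) is the number of sentences in a given list that start with the prefix w_1...i; the names `prefix_freq` and `word_surprisal` are hypothetical and not taken from the library.

```python
import math
from typing import List, Sequence


def prefix_freq(sentences: Sequence[Sequence[str]], prefix: Sequence[str]) -> int:
    """Count the sentences that start with `prefix` (assumed meaning of freq)."""
    n = len(prefix)
    return sum(1 for s in sentences if list(s[:n]) == list(prefix))


def word_surprisal(sentences: Sequence[Sequence[str]],
                   prefix: List[str], word: str) -> float:
    """S(w_i+1) = log(freq(w_1...i)) - log(freq(w_1...i+1)).

    Assumes the extended prefix occurs at least once; otherwise log(0) is undefined.
    """
    return (math.log(prefix_freq(sentences, prefix))
            - math.log(prefix_freq(sentences, prefix + [word])))


sentences = [["john", "sleeps"], ["john", "eats"]]
s_first = word_surprisal(sentences, [], "john")          # every sentence starts with "john"
s_second = word_surprisal(sentences, ["john"], "sleeps")  # one of two continuations
```

Here the first word is fully predictable, so its surprisal is 0, while the second word halves the number of consistent sentences and has surprisal log 2.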
Dimensions # #

BeginItem
Name "sentence"
Meta "semantics"
Input # # # Target # #
Input # # # Target # #
EndItem
For each atomic proposition, the input vector is the zero vector with number of dimensions equal to the number of words produced by dfs_words/1, and the target vector is the vector encoding the atomic proposition.
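As a rough illustration of this mapping, the Python sketch below builds the input/target pair for one atomic proposition. It assumes that each model is a set of the atomic propositions true in it and that the vector encoding a proposition holds its truth value in each model; `atomic_proposition_item` is a hypothetical name, not a library predicate.

```python
from typing import FrozenSet, List, Sequence, Tuple


def atomic_proposition_item(prop: str,
                            models: Sequence[FrozenSet[str]],
                            n_words: int) -> Tuple[List[int], List[int]]:
    """Return (input vector, target vector) for one atomic proposition.

    Assumption: the target has one unit per model, set to 1 if `prop` holds there.
    """
    input_vec = [0] * n_words                        # zero vector, one unit per word
    target = [1 if prop in m else 0 for m in models]  # truth value in each model
    return input_vec, target


models = [frozenset({"sleep(john)"}), frozenset(), frozenset({"sleep(john)", "eat(john)"})]
item = atomic_proposition_item("sleep(john)", models, n_words=4)
```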
surprisal(P,Q) = -log Pr(P|Q)
Pr(P|Q) = Pr(P&Q) / Pr(Q)
Pr(P) = sum_i(v_i(P)) / |M|
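The three definitions above can be sketched in Python over a small set of models. The sketch assumes binary truth values, so v_i(P) is 1 if P holds in model i and 0 otherwise, and it represents formulas as plain Python functions; all names are hypothetical and none of this is the library's own API.

```python
import math
from typing import Callable, FrozenSet, List

Model = FrozenSet[str]             # atomic propositions true in one model
Formula = Callable[[Model], bool]  # hypothetical stand-in for a formula


def prob(models: List[Model], p: Formula) -> float:
    """Pr(P) = sum_i(v_i(P)) / |M|, with binary v_i(P)."""
    return sum(1 for m in models if p(m)) / len(models)


def cond_prob(models: List[Model], p: Formula, q: Formula) -> float:
    """Pr(P|Q) = Pr(P&Q) / Pr(Q); assumes Pr(Q) > 0."""
    return prob(models, lambda m: p(m) and q(m)) / prob(models, q)


def surprisal(models: List[Model], p: Formula, q: Formula) -> float:
    """surprisal(P,Q) = -log Pr(P|Q); assumes Pr(P|Q) > 0."""
    return -math.log(cond_prob(models, p, q))


models = [frozenset({"a", "b"}), frozenset({"a"}), frozenset({"b"}), frozenset()]
a = lambda m: "a" in m
b = lambda m: "b" in m
```

In this toy set, `a` holds in half of the models and in half of the models where `b` holds, so `surprisal(models, a, b)` equals log 2.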
DH(P,Q) = H(Q) - H(P)
S(w_i+1) = -log(Pr(v(w_1...i+1)|v(w_1...i)))
where v(w_1...i) is the disjunction of all semantics consistent with the prefix w_1...w_i.
H(P) = - sum_{s in S} Pr(s|P) * log Pr(s|P)
where the set S consists of all possible points in the DFS space that are fully specified with respect to the atomic propositions; that is, each point s in S constitutes a unique logical combination of all atomic propositions.
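A minimal Python sketch of H(P) and DH(P,Q) follows. It assumes that the fully specified points in S can be approximated by the distinct models in a model set, with Pr(s|P) estimated as the share of P-satisfying models that equal s; this is an illustrative simplification with hypothetical names, not the library's implementation.

```python
import math
from collections import Counter
from typing import Callable, FrozenSet, List

Model = FrozenSet[str]             # atomic propositions true in one model
Formula = Callable[[Model], bool]  # hypothetical stand-in for a formula


def entropy(models: List[Model], p: Formula) -> float:
    """H(P) = - sum_{s in S} Pr(s|P) * log Pr(s|P).

    Points with Pr(s|P) = 0 contribute nothing and are skipped.
    """
    counts = Counter(m for m in models if p(m))
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def delta_entropy(models: List[Model], p: Formula, q: Formula) -> float:
    """DH(P,Q) = H(Q) - H(P)."""
    return entropy(models, q) - entropy(models, p)


models = [frozenset({"a", "b"}), frozenset({"a"}), frozenset({"b"}), frozenset()]
a = lambda m: "a" in m
true = lambda m: True
```

With four equally likely points, H is log 4 when nothing is known and log 2 once `a` is known, so `delta_entropy(models, a, true)` is log 2.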
Word vector representations (WVecs) are pre-specified (e.g., using dfs_localist_word_vectors/2), and a vector representation of the sentence meaning, specified by the formula Sem, is derived from ModelSet.
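To make "localist" concrete, the Python sketch below assigns each word a vector with exactly one active unit. It illustrates the general idea only; the function name is hypothetical and the sketch does not reproduce the behaviour or output format of the Prolog predicate.

```python
from typing import Dict, List, Sequence


def localist_word_vectors(words: Sequence[str]) -> Dict[str, List[int]]:
    """Map each word to a one-hot vector with one unit per word (assumed encoding)."""
    n = len(words)
    return {w: [1 if i == j else 0 for j in range(n)]
            for i, w in enumerate(words)}


wvecs = localist_word_vectors(["john", "sleeps", "eats"])
```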
Pr(P|Q) = Pr(P&Q) / Pr(Q)
Pr(P) = sum_i(v_i(P)) / |M|
DH(w_i+1) = H(w_i) - H(w_i+1)
H(P) = - sum_{s in S} Pr(s|P) * log Pr(s|P)
where the set S consists of all possible points in the DFS space that are fully specified with respect to the atomic propositions; that is, each point s in S constitutes a unique logical combination of all atomic propositions.
Pr(P&Q) = sum_i(v_i(P&Q)) / |M|
inf(P,Q) = (Pr(P|Q) - Pr(P)) / (1 - Pr(P))   if Pr(P|Q) > Pr(P)
inf(P,Q) = (Pr(P|Q) - Pr(P)) / Pr(P)         otherwise
Dimensions # #
where the first '#' is an integer giving the input vector size, and the second '#' is an integer giving the target vector size. The remainder of the file consists of item blocks:
Sentences format:
BeginItem
Name "sentence"
Meta "semantics"
Input # # # Target # #
Input # # # Target # #
EndItem
[...]
where 'sentence' is a sentence, and 'semantics' its formatted FOL semantics. The '#'s are the integer units of the input/target vectors.
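The item block layout can be sketched as a small Python formatter. The sketch assumes one line per field, one Input/Target line per word, and the same target vector on every line; `format_item` is a hypothetical helper, not a library predicate.

```python
from typing import List, Sequence


def format_item(sentence: str, semantics: str,
                input_vectors: Sequence[Sequence[int]],
                target: Sequence[int]) -> str:
    """Render one item block (assumed layout: one field per line)."""
    tgt = " ".join(str(u) for u in target)
    lines: List[str] = ["BeginItem", f'Name "{sentence}"', f'Meta "{semantics}"']
    for vec in input_vectors:
        inp = " ".join(str(u) for u in vec)
        lines.append(f"Input {inp} Target {tgt}")  # one line per word
    lines.append("EndItem")
    return "\n".join(lines)


block = format_item("john sleeps", "sleep(john)", [[1, 0], [0, 1]], [1, 0, 1])
header = "Dimensions 2 3"  # input vector size, then target vector size
```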
Discourse format:
BeginItem
Name "sentence1 #### sentence2"
Meta "semantics1 #### semantics2"
Input # # # Target # #
Input # # # Target # #
EndItem
[...]
where 'sentence1' and 'sentence2' are sentences of a discourse, and 'semantics1' and 'semantics2' the (possibly) varying semantics for the two sentences. The single '#'s are the integer units of the input/target vectors, and the '####' string is a sentence divider.
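The role of the '####' divider can be illustrated with a short Python sketch that joins and splits the Name and Meta fields of a discourse item. The helper names are hypothetical, and the single space around the divider is an assumption based on the layout shown above.

```python
from typing import List, Sequence, Tuple

DIVIDER = "####"  # sentence divider used in discourse items


def discourse_fields(sentences: Sequence[str],
                     semantics: Sequence[str]) -> Tuple[str, str]:
    """Build the Name and Meta lines of a discourse item."""
    sep = f" {DIVIDER} "
    return f'Name "{sep.join(sentences)}"', f'Meta "{sep.join(semantics)}"'


def split_discourse(field: str) -> List[str]:
    """Recover the per-sentence parts from a Name or Meta value."""
    return [part.strip() for part in field.split(DIVIDER)]


name, meta = discourse_fields(["john sleeps", "mary eats"],
                              ["sleep(john)", "eat(mary)"])
```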
H(w_i) = -sum_(w_1...i,w_i+1...n) Pr(w_1...i,w_i+1...n|w_1...i) * log(Pr(w_1...i,w_i+1...n|w_1...i))
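This entropy over sentence continuations can be sketched in Python. The sketch assumes that the probability of a complete sentence given a prefix is its relative frequency among the listed sentences that start with that prefix; `syntactic_entropy` is a hypothetical name, not a library predicate.

```python
import math
from collections import Counter
from typing import Sequence


def syntactic_entropy(sentences: Sequence[Sequence[str]],
                      prefix: Sequence[str]) -> float:
    """Entropy over the complete sentences consistent with `prefix`.

    Probabilities are relative frequencies among matching sentences (assumption).
    """
    n = len(prefix)
    counts = Counter(tuple(s) for s in sentences if list(s[:n]) == list(prefix))
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


sentences = [["john", "sleeps"], ["john", "eats"],
             ["mary", "sleeps"], ["mary", "eats"]]
h0 = syntactic_entropy(sentences, [])        # four equally likely sentences
h1 = syntactic_entropy(sentences, ["john"])  # two remain after "john"
dh = h0 - h1                                 # entropy reduction from the word "john"
```

In this toy example the first word removes half of the candidate sentences, so the entropy reduction is log 2.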
DH(w_i+1) = H(w_i) - H(w_i+1)
DH(P,Q) = H(Q) - H(P)
surprisal(P,Q) = -log Pr(P|Q)
Word vector representations (WVecs) are pre-specified (e.g., using dfs_localist_word_vectors/1), and a vector representation of the sentence meaning, specified by the formula Sem, is derived from ModelSet.