Discussion of the main problems in the project - NLP
Manual formalization versus NLP tools:
The decision has to be made now (!) whether, and from which point on, to use NLP tools, self-developed or available as open source, to formalize natural language sentences in pw, i.e. to translate them into propositional logic, predicate logic or other defined structures, so that the consistency of an evaluated set of sentences can be computed.
There are tools out there, mostly analyzing sentences grammatically, sometimes semantically or even logically (we found a French one, which does not fit here).
1) The crucial question is whether they A) provide a logical rewriting of sentences, as known from university logic classes, and B) deliver it in the format chosen for pw (or at least in a format that can be converted into our chosen default format).
2) The next question is at which computation step in pw this should occur. The problem: if we process atomic sentences manually and then use a tool that translates natural language directly into predicate logic, how do we map the two versions, the manual assertional one and the automated predicate one, onto each other? Can we be sure that they map (no!)?
3) Which predicate logic format shall we target? (Propositional logic should be much easier to produce.)
As these are completely open questions, see all questions 1) - 3) outlined for the pw Server version and contribute at
pw Server home page
or on:
pw home
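To make concrete what "computing the consistency of a set of sentences" means once the sentences are formalized in propositional logic, here is a minimal sketch in Python. The nested-tuple representation and the function names (atoms, evaluate, find_model) are assumptions made for illustration only; they are not the pw format. The check simply tries every truth assignment, so it is only practical for small sets of atomic sentences.

```python
from itertools import product

# Hypothetical formula representation (not the pw format):
#   "p"                                   atomic sentence
#   ("not", f)                            negation
#   ("and", f, g), ("or", f, g), ("implies", f, g)

def atoms(formula):
    """Collect the names of all atomic sentences in a formula."""
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms(part) for part in formula[1:]))

def evaluate(formula, valuation):
    """Evaluate a formula under a dict mapping atom names to booleans."""
    if isinstance(formula, str):
        return valuation[formula]
    op, *args = formula
    if op == "not":
        return not evaluate(args[0], valuation)
    if op == "and":
        return evaluate(args[0], valuation) and evaluate(args[1], valuation)
    if op == "or":
        return evaluate(args[0], valuation) or evaluate(args[1], valuation)
    if op == "implies":
        return (not evaluate(args[0], valuation)) or evaluate(args[1], valuation)
    raise ValueError(f"unknown operator: {op!r}")

def find_model(sentences):
    """Return a valuation satisfying all sentences, or None if inconsistent."""
    names = sorted(set().union(*(atoms(s) for s in sentences)))
    for values in product([False, True], repeat=len(names)):
        valuation = dict(zip(names, values))
        if all(evaluate(s, valuation) for s in sentences):
            return valuation
    return None

# "If it rains, the street is wet" and "It rains" are consistent;
# adding "The street is not wet" makes the set inconsistent.
sentences = [("implies", "rain", "wet"), "rain"]
print(find_model(sentences))
print(find_model(sentences + [("not", "wet")]))
```

A set of sentences is consistent exactly when find_model returns a valuation. For predicate logic no such exhaustive check exists in general, which is one reason the choice of target logic matters.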
NLP-supported logical formalization:
Tools for natural language analysis follow at least two distinct approaches. The first uses extended grammar definitions that cover even many of the exceptions we face in natural languages, but these grammar definitions become huge and hard to maintain. The second is statistical: learning algorithms are run against large text bases (corpora) to answer simple questions, namely which word orders occur and how probable each one is compared to similar ones (without deriving grammar rules). These algorithms are simpler to maintain, and their results are comparable to those of the first approach. Perhaps they even come closer to the way a human being learns a language. Combinations of both approaches also exist.
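The statistical idea, counting which word orders occur and estimating their probability, can be illustrated with a tiny bigram model. This is only a sketch under simple assumptions: the three-sentence corpus is invented, the function names are hypothetical, and real tools use far larger corpora and smoothing for unseen word pairs.

```python
from collections import Counter

def train_bigrams(corpus):
    """Count adjacent word pairs and how often each word starts a pair."""
    pair_counts, left_counts = Counter(), Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        for left, right in zip(words, words[1:]):
            pair_counts[(left, right)] += 1
            left_counts[left] += 1
    return pair_counts, left_counts

def probability(pair_counts, left_counts, left, right):
    """Relative-frequency estimate of P(right | left); 0.0 if left is unseen."""
    if left_counts[left] == 0:
        return 0.0
    return pair_counts[(left, right)] / left_counts[left]

# Invented toy corpus, for illustration only.
corpus = [
    "the cat sleeps",
    "the cat eats",
    "the dog sleeps",
]
pairs, lefts = train_bigrams(corpus)
print(probability(pairs, lefts, "the", "cat"))   # "cat" follows "the" in 2 of 3 cases
print(probability(pairs, lefts, "cat", "the"))   # this word order never occurs
```

No grammar rule is written down anywhere: the model only knows that "the cat" is a frequent word order and "cat the" is not.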
When trying to formalize natural language sentences with the support of NLP automation tools, we face the following problems:
A) The tool might follow a strict grammar approach, but maintaining such grammars with all their exceptions is not feasible in practice.
B) We use a tool with a statistical approach. Then sentences are formalized according to statistically derived rules (which we ourselves do not understand in every case),
C) ... rules which might not be as strict as formal grammar rules, but are closer to the way in which we learn language rules ourselves in reality.
D) NLP tools focus on different targets, but we need both grammatical and logical extraction. The two are not fully available together ...
E) Most important: when we formalize atomic sentences for 2BR logics manually and also try to use NLP, we end up with NLP resolving sentences directly at the predicate logic level, and then have to map its output onto the atomic sentences we created manually (see above). One of the two has to take the lead! But which? Is there a reasonable solution for this case?
F) NLP tools handle the grammatical, and in the best cases the semantic, analysis of sentences, but only rarely the logical rewriting (remember rewriting natural language sentences in your first logic course...). So what about this? Can we use these tools now, or is this something for the future?
We are now evaluating and testing some features of open source tools. But we should reach a conclusion before we start transferring our real sentences into some predicate logic, to avoid double work...
EBNF for (predicate) logic syntax check and parsing:
Once a propositional or predicate logic syntax has been decided for PWR, we can turn to the problem of formalizing it. The propositional logic is basically defined beforehand (PWR), but the propositional one for 3BR is not.
If we had a formalization of the syntax rules at hand for either case, we could use it to check input for correct syntax before processing, or even generate the parser procedures from the syntax definition.
Standards and tools exist for both. EBNF is the de facto standard for describing formal languages in a formal language (its notation for repetition and options is quite similar to Unix regular expression syntax). Given such a 'grammar', there are tools which generate parsers from it, so we can use them. See 'ANTLR' as a reference.
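To illustrate how an EBNF definition relates to a syntax check, here is a small sketch in Python. The toy grammar in the comment, the operator symbols (~, &, |, ->) and all names are assumptions for illustration; they are not the pw syntax, and the parser is written by hand rather than generated, although it follows the EBNF rule by rule in the way a parser generator would.

```python
import re

# Assumed toy grammar in EBNF (not the pw syntax):
#   formula     = disjunction , [ "->" , formula ] ;
#   disjunction = conjunction , { "|" , conjunction } ;
#   conjunction = unary , { "&" , unary } ;
#   unary       = "~" , unary
#               | ( "forall" | "exists" ) , name , "." , unary
#               | "(" , formula , ")" | atom ;
#   atom        = name , [ "(" , name , { "," , name } , ")" ] ;
TOKEN = re.compile(r"\s*(->|[()~&|,.]|[A-Za-z][A-Za-z0-9]*)")
KEYWORDS = {"forall", "exists"}

def tokenize(text):
    """Split the input into operator, bracket and name tokens."""
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character near position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    """Recursive descent: one method per EBNF rule above."""
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {token!r}")
        self.pos += 1
        return token

    def name(self):
        token = self.take()
        if not token[0].isalpha() or token in KEYWORDS:
            raise SyntaxError(f"expected a name, got {token!r}")
        return token

    def formula(self):
        left = self.disjunction()
        if self.peek() == "->":          # implication is right-associative
            self.take()
            return ("implies", left, self.formula())
        return left

    def disjunction(self):
        node = self.conjunction()
        while self.peek() == "|":
            self.take()
            node = ("or", node, self.conjunction())
        return node

    def conjunction(self):
        node = self.unary()
        while self.peek() == "&":
            self.take()
            node = ("and", node, self.unary())
        return node

    def unary(self):
        token = self.peek()
        if token == "~":
            self.take()
            return ("not", self.unary())
        if token in KEYWORDS:
            quantifier, variable = self.take(), self.name()
            self.take(".")
            return (quantifier, variable, self.unary())
        if token == "(":
            self.take()
            node = self.formula()
            self.take(")")
            return node
        return self.atom()

    def atom(self):
        predicate, args = self.name(), []
        if self.peek() == "(":
            self.take()
            args.append(self.name())
            while self.peek() == ",":
                self.take()
                args.append(self.name())
            self.take(")")
        return ("atom", predicate, tuple(args))

def parse(text):
    """Return the syntax tree, or raise SyntaxError for ill-formed input."""
    parser = Parser(tokenize(text))
    tree = parser.formula()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

Calling parse("forall x . (Man(x) -> Mortal(x))") returns a nested tuple, while parse("(P & Q") raises SyntaxError because the closing bracket is missing. A generated parser would offer the same kind of check, with the grammar kept in a separate file instead of being spread over methods.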
So where are the problems?
1) Producing such a grammar for the given (predicate) logic at runtime, by the user.
2) The grammar from 1) should be consistent with the 2BR logic syntax. How can this be verified?
3) (Predicate) logic is a more complex language in the Chomsky hierarchy. It needs nested bracketing, which regular expressions cannot describe, so it requires a more complex grammar type (context-free at least, hopefully not 'context-sensitive'). We have to make sure that our grammar rules do not make this type so complicated that a parser generator can no longer generate a parser from our grammar!
4) Is the user even able to upgrade a given default predicate logic grammar correctly?
5) We have to adhere to the chosen default predicate logic format. But there isn't one yet; the discussion is still open. Shall we use KIF? See the pwSrv discussion at:
pw Server home page
or on:
pw home
6) The parser generator need not, but should, be built in the same programming language that pw will use for the 3BR (upgrade). This, too, is open for discussion now. See the same links...
These problems are still on the horizon, but the way we choose to proceed will have a large impact on the pw project in the future.
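As a reference point for the format question, the following sketch renders a formula tree in KIF-style prefix notation, where variables carry a '?' prefix and implication is written '=>'. The tuple tree shape and the function name to_kif are hypothetical and only serve this illustration; whether KIF becomes the pw default is still open, and the sketch covers only a small subset of the notation.

```python
# Hypothetical tree shape (an assumption, not the pw format):
#   ("atom", predicate, (args...)), ("not", f),
#   ("and", f, g), ("or", f, g), ("implies", f, g),
#   ("forall", var, f), ("exists", var, f)
KIF_OPERATORS = {"not": "not", "and": "and", "or": "or", "implies": "=>"}

def to_kif(formula, bound=frozenset()):
    """Render a formula tree as a KIF-style prefix expression."""
    kind = formula[0]
    if kind == "atom":
        _, predicate, args = formula
        if not args:
            return predicate
        # Bound variables get the '?' prefix; other arguments are constants.
        terms = [f"?{arg}" if arg in bound else arg for arg in args]
        return "(" + " ".join([predicate, *terms]) + ")"
    if kind in ("forall", "exists"):
        _, variable, body = formula
        inner = to_kif(body, bound | {variable})
        return f"({kind} (?{variable}) {inner})"
    parts = " ".join(to_kif(part, bound) for part in formula[1:])
    return f"({KIF_OPERATORS[kind]} {parts})"

# "All men are mortal" as a quantified implication.
rule = ("forall", "x", ("implies",
        ("atom", "Man", ("x",)),
        ("atom", "Mortal", ("x",))))
print(to_kif(rule))
```

Keeping the internal tree separate from the output notation means the target format can still be changed later without touching the formalized sentences themselves.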
Other topics are not mentioned for now, just some keywords:
- functional analysis in the upgrade rules to 2BR should be implemented, but even with overriding PWRules taken into account?
- language approaches vs modern model theories (R. Giere)
- predicate logic vs. model theory? Can we really expect to build philosophical models?
- Translating SE to EN before or after TS formalization? Both variants are possible right now, with translation from Lang-X to EN first as the default. But will both variants give an identical result? ...