All cats have four legs. This animal is a cat. (Therefore) This animal has four legs.
Classically, there are umpteen-plumpty (to be precise) ways in which syllogisms can be incorrectly used, and only a few correct ways. ;) Go read a philosophy primer (a polite way of saying an Idiot's Guide) about all these types of syllogistic errors, or see this link:
http://en.wikipedia.org/wiki/Syllogistic_fallacy
The basis of all deductive/propositional logic is that the premises are BINARY, i.e., they can ONLY be TRUE or FALSE; no other value is permitted. Frex, a premise cannot be half-true, or half-false. This is formally called the Law of the Excluded Middle.
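As a minimal illustration, assuming nothing beyond Python's built-in booleans, the binary nature of premises can be checked mechanically: a premise p admits only the two values True and False, so the excluded-middle tautology can never fail.

```python
# In two-valued logic a premise is exactly True or False, so the
# excluded-middle tautology "p or not p" holds for every premise:
for p in (True, False):
    assert (p or not p) is True

# ...and no premise is both true and false (non-contradiction):
for p in (True, False):
    assert (p and not p) is False

print("both laws hold for every binary premise")
```

Exhaustively checking both possible values is only feasible because the logic is binary; that is exactly the point of the Excluded Middle.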
INDUCTIVE LOGIC
http://en.wikipedia.org/wiki/Inductive_reasoning
Whereas DEDUCTION is a logical process where the general leads to the particular, i.e., a major premise leads to MANY particular conclusions, INDUCTION is the process where MANY particular observations lead to a major premise. How many observations are enough? Go see:
http://en.wikipedia.org/wiki/Sample_size_determination
Thus we are advised to set up sizable test populations, ensure double-blind testing, ascertain means, variances, and correlation coefficients, and only then generalise, or INDUCE, a general truth. But then some ninny will cavalierly dismiss such testing by uttering a modern platitude: "Correlation does not mean Causation". But how did this junior-stats-class admonition become so misused and so widespread? See: http://www.3quarksdaily.com/3quarksdaily/2012/10/why-do-people-love-to-say-that-correlation-does-not-imply-causation.html From this link, note: "While correlation does not prove causation, it sure as hell provides a hint."

There is also a delightful Mathematical Induction Proof, only for integers: IF some theorem is proven for (i) a base case (say n = 1), then (ii) assumed true for the kth case, and (iii) thence proven true for the (k+1)th case, THEN these mere 3 coupled steps (inductively) prove the general case for an infinity of integers! See: http://www.mathsisfun.com/algebra/mathematical-induction.html
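To make that induction recipe concrete, here is a small Python sanity check using the classic textbook example (the sum 1 + 2 + ... + n = n(n+1)/2 stands in for "some theorem"; it is my illustrative choice, not from the text above). It verifies the base case, then numerically confirms that each case k leads to case k+1. A check, not a proof: the real proof does step (iii) algebraically, for all k at once.

```python
def closed_form(n):
    """The claimed theorem: 1 + 2 + ... + n == n * (n + 1) / 2."""
    return n * (n + 1) // 2

# (i) Base case, n = 1:
assert closed_form(1) == 1

# (ii)+(iii) Inductive step, checked numerically for k = 1 .. 999:
total = 1                      # running sum 1 + 2 + ... + k, starting at k = 1
for k in range(1, 1000):
    total += k + 1             # extend the sum from case k to case k+1
    assert total == closed_form(k + 1)

print("formula confirmed for n = 1 .. 1000")
```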
COMPUTER LOGIC
Output states (Karnaugh Map):

             T2 = OFF    T2 = ON
   T1 = OFF     S1          S2
   T1 = ON      S3          S4

= The total state space for this 2-bit computer.

In computer science, those logic states are often represented by a Karnaugh Map (see the figure above), where T1 and T2 represent the possible logic values for transistors T1 and T2 respectively, and S1, S2, S3, S4 represent the 4 possible Output Logic States mentioned above. The 4 states combined are also termed "the total (logic) state-space available for this computer". One can then easily extrapolate beyond this 2-bit computer: IF a computer uses 64-bit logic (i.e., 64 output transistors in parallel inside the CPU), THEN there exist 2^64 = 18,446,744,073,709,551,616, i.e. about 18 Exa logic states! And note, EVERY (logic) variable inside the program has this range of truth choice. That's certainly quite a bit more than the above simplistic TRUE or FALSE. Thus, between the extremes of Yes & No, a computer can fit such a large range of finely graded answers that a human could even think this Digital Computer was not so far different from an Analogue Computer, or from human reasoning!!
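For the curious, the state-space counting above can be reproduced in a few lines of Python (a sketch; the names `state_space` and `n_bits` are my own):

```python
from itertools import product

def state_space(n_bits):
    """Enumerate every OFF/ON combination of n_bits transistors."""
    return list(product(("OFF", "ON"), repeat=n_bits))

# The 2-bit computer above: exactly the 4 states S1..S4.
print(len(state_space(2)))   # 4

# A 64-bit word is far too big to enumerate, but easy to count:
print(2 ** 64)               # 18446744073709551616, roughly 18 exa-states
```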
We conclude that though individual computer components are certainly bound by BINARY LOGIC, their combined state-space logic becomes far more complex.

1/3 /var/www/apps/conversion/tmp/scratch_6/172252491.doc 10/09/2013
PROBABILITY DISTRIBUTIONS

There exist a great many WELL KNOWN probability distributions (profiles) which can be selected for suitable applications. Frex, the Gaussian (or Normal) distribution is often applicable to biological matters, such as height, weight, and IQ. HOWEVER, this Gaussian curve is quite inaccurate for adult IQ, because it does not represent the total population, as born. Many handicapped children, with IQ below 85, suffer multiple handicaps, and thus do not survive much beyond age 20. They therefore do not get sampled into the adult profile, but they SHOULD be, to properly represent IQ over all ages. To represent such an adult distribution, one could instead employ the F-distribution, where the lower-level IQ scores have been deleted. Crudely expressed, one is thus more likely to meet an adult person with IQ well over the average, rather than below! Yeah, I know, all YOUR coworkers are bloody idiots!

One interesting probability distribution is that of the ball/roller/taper bearing, as often used in conveyor belts and the like. This distribution is perfectly flat, meaning a bearing under a conveyor belt, pouring mining product into a ship's hull, is equally likely to fail 5 minutes after installation as 50 years after installation.

Another is the Electronic Device Bathtub Distribution, where any such device, whether a transistor, or a TV, or a whole TV transmitter, has extremely high rates of failure within the first 10% of its lifetime (called burn-in), AND also within the last 10% of its lifetime (called burn-out), but will sail along without any trouble for the middle 80% of its lifetime.

PUB LOGIC

Who has not heard, in the local rubbity, these truths: "Everyone is a liar." "It's all a gummint conspiracy!" "An idiot may utter wisdom, and a sage may utter dreck." "Ya (no one) nivva knows!" These generally fall into a category known as the Epimenidean Liar Paradox, whose shortest version is: "This statement is false." See Graham PRIEST [Univ. of Melbourne] & Liar Paradoxes: http://en.wikipedia.org/wiki/Liar_paradox

Along with these dubious fact claims, we hear them backed up with "and that's Absolutely true", or just "Absolutely". However, for any notion to become an Absolute Truth, it would have to attract the concordance of EVERY observer in the Universe. Gathering results from EVERY observer in the Universe is currently impossible (time & distance). So we could modify the demand, by asking every observer on this planet. Again, practical experience will undoubtedly reveal discordance about most matters. Even one's own family rarely agree 100% about most hypotheses. So, Absolute Truths? Not bloody likely!!!
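Returning to the truncated-IQ argument under probability distributions above, it can be illustrated numerically. This sketch uses my own illustrative assumptions (mean 100, standard deviation 15, cutoff 85): it simply deletes the lower scores from a simulated Gaussian population and shows that the surviving average sits above the as-born average.

```python
import random

random.seed(42)

# As-born population: IQ roughly Gaussian, mean 100, sd 15 (assumed).
population = [random.gauss(100, 15) for _ in range(100_000)]

# Adult profile with the lower-level scores deleted, as in the text:
adults = [iq for iq in population if iq >= 85]

def mean(xs):
    return sum(xs) / len(xs)

print(round(mean(population)))  # about 100
print(round(mean(adults)))      # several points higher
```

So an adult picked at random from the truncated profile is indeed more likely to sit above the as-born average than below it.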
BAYESIAN LOGIC (also called FUZZY LOGIC (FL), and/or CERTAINTY LOGIC)
Alas, the word Fuzzy can be used as a derogatory term in English, for it can mean any of: imprecise, inaccurate, shoddy, fake, not kosher, not halal, etc. But it has now become a serious and respectable word in Maths and Engineering. Lotfi ZADEH, the creator of Fuzzy Logic, established full academic credentials for this discipline in about 1970. Soon after, Matsushita Electric Company adopted his ideas, and used FL in: Dishwashers, Washing Machines, Air Conditioners, Microwave Ovens, Cameras, Camcorders, TVs, Copiers. Further, it would be safe to assert that nowadays most upmarket luxury cars utilise FL in their automatic gearboxes, for selecting gear ratios, estimating the time of changing gear, disengaging and engaging dual clutches, and limiting the torque applied to any particular gear.

Real-World-Vagueness vs Digital-World-Crispness

Everyday language abounds with vague and imprecise concepts, such as "Sally is tall", or "It is very hot today". But crisper (scientific) language might say "Sally is 152 cm ± 0.5 cm high", or "Today's temperature is 30°C ± 0.2°". FL provides a scheme for dealing with the imprecision that is so often used by humans. The central notion of FL is that all truth-certainty values fall within the range from 0 to 1. This is quite alarming for those brought up with the well-honoured Law of the Excluded Middle. One should note that FL subsumes Boolean Logic, because 100% certainty = TRUE, and 0% certainty = FALSE. But FL also permits certainty values in-between TRUE and FALSE.

One should also note that certainty is not the same as probability. Let me explain it this way. Say there is some new scientific phenomenon, for which there is no fully established explanation yet. Let us postulate that there exist 10 separate hypotheses to explain this phenomenon, but not a lot of experimental data.
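Here is a minimal sketch of how FL grades truth between 0 and 1, using the "Sally is tall" example. The function name and the 150 cm / 190 cm ramp endpoints are my own illustrative choices, not a standard; real FL systems pick membership curves to suit the application.

```python
def tall_certainty(height_cm, low=150.0, high=190.0):
    """Fuzzy membership of 'this person is tall': 0 below `low`,
    1 above `high`, and a linear ramp in between."""
    if height_cm <= low:
        return 0.0
    if height_cm >= high:
        return 1.0
    return (height_cm - low) / (high - low)

print(tall_certainty(140))  # 0.0 -- crisp FALSE (Boolean logic is subsumed)
print(tall_certainty(200))  # 1.0 -- crisp TRUE
print(tall_certainty(170))  # 0.5 -- in-between, forbidden by the Excluded Middle
```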
Bayesian Hypothesis Selection/Induction

Given these 10 contending hypotheses, BHS asserts that we must equally apportion the initial allocated Certainty (technically called informatic entropy) between all these hypotheses, both new and old*. Thus h0 is given certainty 0.1, h1 certainty 0.1, and so on, until we attribute h9 with certainty 0.1.

*[The fact that older accepted hypotheses have to start from scratch again denies the claim that "the burden of proof lies only on new contenders".]

Experiments are then undertaken, and normally some hypotheses acquire additional confirming evidence, while others may acquire none (zero) at all, or even disconfirming (negative) evidence. Fortunately, older hypotheses can bring with them a substantive legacy of previous experimental evidence. At a later time we may see the initial certainty profile evolve to: h0 = 0.7, h1 = h2 = 0.1, h3 = 0.8, h4 = 0.1, and so on. Note that the sum of all certainties does not have to equal 1.0, as it must for probabilities. Clearly h0 and h3, as they stand here, are (almost) equally good hypotheses. How do we choose between equally GOOD competing explanations? There are several strategies used to resolve such a contest:

1. We say we don't know which hypothesis is best, and use either hypothesis h0 or h3 as circumstances suit.
2. We suspend implementing any hypothesis, and say more evidence is needed.
3. We demand that any successful hypothesis must surpass its competitors by a large amount (e.g. the 5-Sigma Proof*, in Physics).

* See: http://www.telegraph.co.uk/science/large-hadron-collider/9374758/Higgs-boson-scientists-99.999-sure-God-Particle-has-been-found.html

Note these logic certainties do not at all correspond to the probabilities of gaming theory or statistics. Certainty measures the degree of belief in a probability outcome, using either a scale of 0 to 100% certainty, or a scale of −100% through 0 to +100% certainty; whereas Probability measures how often a particular outcome occurs, as a number from 0 to 100%.
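The apportion-then-update bookkeeping described above fits in a few lines. The update figures are the illustrative ones from the text; the dictionary layout is my own sketch, not a standard BHS implementation.

```python
# Equal initial apportionment of certainty over 10 hypotheses:
hypotheses = {f"h{i}": 0.1 for i in range(10)}

# After some experiments (illustrative figures as in the text):
hypotheses["h0"] = 0.7
hypotheses["h3"] = 0.8

best = max(hypotheses, key=hypotheses.get)
print(best)                                # h3 -- but h0 is almost equally good
print(round(sum(hypotheses.values()), 1))  # 2.3: certainties need not sum to 1
```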
Each and every time a probability figure disagrees with factual reality, the certainty in that probability is reduced. Certainty incorporates past performance, whereas Probability (e.g. the tossing of a coin) is said to be the same with each test, no matter how many heads or tails appeared in the past.

A very rough analogy may help understanding here. In acoustic engineering, human ears can hear/tolerate an acoustic loudness level from 10 dB to 110 dB (a unit that can be converted to Watts/m^2). But if the sound is speech or music, then no matter how loud/quiet this sound may be, the message becomes incoherent if the Signal-to-Noise Ratio (SNR, a ratio of signal power to noise power) falls below approximately 5 dB. One may crudely analogise Probability to Loudness, and Certainty to SNR: two different but important aspects of logic, and two different but important aspects of sound.

How does one enumerate a Certainty estimate? Here's one (junior) methodology. If statistical theory/hypothesis predicts a numerical outcome O1, and laboratory mensuration gives a numerical outcome O2, then the fractional error/disagreement can be found from: |(O1 − O2)/O1| = O_error. A number of people, and labs, then perform this exercise as well. Then sum all these errors: O_sum. The Certainty in hypothesis O is then (1 − O_sum).

The Bayesian Formula/Law/Rule, as used in this process of certainty/statistical induction, happens to be a monster of a formula. See: Annals Of The New York Academy Of Sciences, 2011, The Year in Cognitive Neuroscience, "Bayesian Models: The Structure Of The World, Uncertainty, Behavior, And The Brain", Iris VILARES and Konrad KORDING: "Bayesian Statistics gives a systematic way of calculating optimal estimates based on noisy or uncertain data. Models in Bayesian statistics start with the idea that the nervous system needs to estimate variables in the world that are relevant (x) based on observed information (o), typically coming from our senses."
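The junior methodology above translates directly into code (a sketch; `predicted` plays the role of O1, and each entry of `measured` is an O2 reported by a different person or lab):

```python
def certainty(predicted, measured):
    """Sum the fractional errors |(O1 - O2)/O1| over all labs (O_sum),
    then report (1 - O_sum) as the Certainty in the hypothesis."""
    o_sum = sum(abs((predicted - o2) / predicted) for o2 in measured)
    return 1.0 - o_sum

# Three labs in close agreement with a predicted outcome of 10.0:
print(round(certainty(10.0, [10.1, 9.9, 10.05]), 3))  # 0.975 -- high certainty
```

Note that wildly disagreeing labs can drive O_sum above 1, giving a negative certainty, which matches the (−)100% to (+)100% scale mentioned earlier.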
BAYES' RULE then allows calculating how likely each potential estimate x is, given the observed information o: p(x|o) = p(o|x) · p(x) / p(o). For example, consider the case of estimating whether our cat is hungry (x = hungry) given that it meowed (o = meow). If the cat usually meows when hungry (p(o|x) = 90%), is hungry a quarter of the time (p(x) = 25%), and does not meow frequently (p(o) = 30%), then it is quite probably hungry when it meows (p(x|o) = 90% × 25% / 30% = 75%).

My paper here is NOT about the intricacies of such stats, worthy as they be, so a rough analogy may suffice. Given the toss of a (fair) coin, a STATISTICAL premise asserts that each and every toss of that coin has equal probability of turning up heads or tails, no matter how many heads or tails have preceded the current event (toss). But BAYES says otherwise: prior hypotheses can be updated, by merging prior hyp/stats with new hyp/stats, to give better hyp/stats. This is now the basis of all modern Scientific Induction. The final result of this Bayesian iteration is termed (via Eliezer YUDKOWSKY) not absolute truth, but the least wrong truth.
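The hungry-cat numbers from VILARES and KORDING can be checked in a couple of lines of Python (the function name is my own):

```python
def bayes(p_o_given_x, p_x, p_o):
    """Bayes' Rule: p(x|o) = p(o|x) * p(x) / p(o)."""
    return p_o_given_x * p_x / p_o

# The cat meows when hungry 90% of the time, is hungry 25% of the
# time, and meows 30% of the time overall:
print(round(bayes(0.90, 0.25, 0.30), 2))  # 0.75 -> quite probably hungry
```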