Finding correspondence between Wittgenstein's Tractatus and programming.

About 10% through reading Tractatus, I realised that many statements that Wittgenstein is making have a lot of sense if interpreted in a programming context.

This file is the orgification of his book.

From the very beginning, from the first footnote, it stroke me as extremely fitting into an "org-like" format of a tree of thoughts. I decided to convert it into an org-mode file, with every thought represented by a heading, and every leaf being my comment on what this leaf actually means. All Wittgenstein's thoughts are represented as headings, and mine are contained within bodies. All my thoughts follow the "one sentence – one line" rule. Every Wittgenstein's thought is at a single line, but his thoughts may consist of several sentences.

Knowing the Lisp's saying "top level is hopeless", I took the liberty of adding this explanatory comment at the top level, in a hope that if someone is going to parse this file with an automated parser, this comment would be easier to bypass.

This file is based on the 1974 edition. Tractatus was originally written in German, in 1921, and translated into English in 1922.

References:

1. 1 The world is all that is the case.

What is "the world"? And what is "the case"? Also, what is "all"?

1.1. 1.1 The world is the totality of facts, not of things.

Okay, the simple thing is: "the world" here is a "logical world", that is something that exists in a computer that processes this "something". Let us throw away a bit of idealism and admit: people are stupid and forgetful, and philosophy is a loose field of research. If we want to make any sense of this book, it way more applies to computers, than to people.

Hence, I should probably be saying "memory" instead of "the world". (Maybe, "storage" would be even better, but let's get back a bit of our idealism and imagine a computer with fast storage.)

"Facts" are, therefore, what we nowadays call "data" (plural).

1.1.1. 1.11 The world is determined by the facts, and by their being all the facts.

A computer cannot get out of it's memory. No matter how and what you program, from the programming perspective it is only the state of memory that is changing.

Moreover, if we state that the memory (or at least some part of the memory) is immutable and large enough to have everything that we would possibly ever be asking from sensors, then we can remove the sensors from our logical system entirely, without loss of generality. That is especially true if we do not allow Random Access Memory, but make the machine move the reading head at finite speed.

1.1.2. 1.12 For the totality of facts determines what is the case, and also whatever is not the case.

So, since we have some stuff written at our machine's memory by default, and no more "external" data can be added, then everything that can be computed, must be computed from the existing data.

Important 1:: we can, obviously (not obviously at all!), generate some random garbage and write it into our memory. And using this random garbage, we can compute (predict) everything. However, we the programmers (logicians) are usually interested in those computation results (logical inferences) that actually make some sense, not random garbage.

So the computation we are making must not contradict what is already in the memory to be useful.

Important 2: maybe we cannot even generate randomness, can we? We can use a PRNG to generate pseudo-random bits, but a PRNG needs a seed, and the only place we can take this seed from is, again, the initial state of the memory.

1.1.3. TODO 1.13 The facts in logical space are the world.

I do not understand this. Is it a repetition of the premise that there is no IO, without loss of generality?

Also, I very strongly feel that I should somehow connect this thought with the "open world" and "closed world" metaphor in inference engines (such as Prolog and SQL), but I'm too ignorant for that.

1.2. 1.2 The world divides into facts.

Memory consists of cells.

1.2.1. 1.21 Each item can be the case or not the case while everything else remains the same.

Cells are bits, and can be either 1 or 0.

(In programming we usually use bytes or words as a minimal elementary operating unit, but perhaps bits can also work as a substrate.)

2. 2 What is the case —a fact— is the existence of states of affairs.

I do not understand. Does he mean that ones in memory should correspond to "true" things in the real world?

2.0.1. 2.01 A state of affairs (a state of things) is a combination of objects (things).

So, this "state of affairs" is that initial memory state that somehow describes the "real world" whatever that be. I am not very sure, but it seems to me that the actual thought here is that this "input" should be partition-able into pieces of input describing various things in the world. (Not sure whether this partition-ability is obvious.)

2.011 It is essential to things that they should be possible constituents of states of affairs.

Does it mean that the input (world) should not be self-contradictory? For example, input should not contain both x=1, and x=2 in whatever kitchen arithmetic we may reason.
2.012 In logic nothing is accidental: if a thing can occur in a state of affairs, the possibility of the state of affairs must be written into the thing itself.

I think this means that "types exist". Wow, that's a grandiose statement, isn't it?

Maybe, it, rather means that "bytes do not mean anything by themselves, but only when there is some human understanding of what these bytes represent. This still implies "types", but in a less mechanistic way.
1. TODO 2.0121 It would seem to be a sort of accident, if it turned out that a situation would fit a thing that could already exist entirely on its own. If things can occur in states of affairs, this possibility must be in them from the beginning. (Nothing in the province of logic can be merely possible. Logic deals with every possibility and all possibilities are its facts.) Just as we are quite unable to imagine spatial objects outside space or temporal objects outside time, so too there is no object that we can imagine excluded from the possibility of combining with others. If I can imagine objects combined in states of affairs, I cannot imagine them excluded from the possibility of such combinations.
  What does it even mean "a situation would fit a thing"?
  
  Okay, I am not all sure in what I am writing here, but here is my view on this:
  1. So, your input is encoding some set of affairs.
  Some things are in some state, some other things are in some other state. Firstly, you should be able to just append an encoding of a state of some thing to the end of the input. Because why not?
  1. On the other hand.
  Assume types exist. Then instead of doing a computation on objects, you can do a computation on types, and create a mapping from a set of all possible inputs (matching your expected types) to a full set of all possible outputs. For "computation" (narrowly understood) that is probably not feasible, as it would blow up exponentially with every input variable, but for "logic" as the science of "all truths", computational inefficiency should not matter.
  
  Make a pull request with your own understanding of what this means.
2. 2.0122 Things are independent in so far as they can occur in all possible situations, but this form of independence is a form of connexion with states of affairs, a form of dependence. (It is impossible for words to appear in two different rôles: by themselves, and in propositions.)
  1. What is the difference between a "connection" and a "connexion"?
  2. What is the difference between a "role" and a "rôle"
  "Dependent" is that they appear as input to the same program and it would just make sense for them to be used together?
  
  Does Wittgenstein use "propositions" to mean "functions"? If yes, then this would mean that functions are not the same thing as variables, right?
  
  Or, rather, reasoning about types one have to give a name to a type, and this name cannot be used for a function at the same time.
3. 2.0123 If I know an object I also know all its possible occurrences in states of affairs. (Every one of these possibilities must be part of the nature of the object.) A new possibility cannot be discovered later.
  
  I think this, again, means that, essentially, "types exist". That is, a C++ Point{int,int}, can only be what I have described, not Point{int,int,int}.
4. TODO 2.01231 If I am to know an object, though I need not know its external properties, I must know all its internal properties.
  
  This I do not understand. What are external and internal properties? If by "internal properties", Wittgenstein means "state", or even "initial state", then this is reasonable.
  
  And "external" means… computable? Like, we do not need to know that P(a) = 4, but we need to know what a is to work with it.
5. 2.0124 If all objects are given, then at the same time all possible states of affairs are also given.
  
  Again, the idea seems to be that we can compute with types instead of instances.
2.013 Each thing is, as it were, in a space of possible states of affairs. This space I can imagine empty, but I cannot imagine the thing without the space.

Again, the input can be empty. But we cannot compute without input.
1. 2.0131 A spatial object must be situated in infinite space. (A spatial point is an argument-place.) A speck in the visual field, though it need not be red, must have some colour: it is, so to speak, surrounded by colour-space. Notes must have some pitch, objects of the sense of touch some degree of hardness, and so on.
  
  Again, this seems to mean that we describe things with parameters.
2.014 Objects contain the possibility of all situations.

Well, if the only thing we have is the input string, then we have nowhere else to draw information.
1. 2.0141 The possibility of its occurring in states of affairs is the form of an object.
  It's a bit confusing, but I feel that what he actually means is the following:
  1. By computing with types, we can get all possible "states of affairs", that is, all possible evolutions of the system defined by the input.
  2. By choosing one set of parameters describing an object (its "form"), we specialise the system to obtain a smaller set of evolutions.

2.0.2. TODO 2.02 Objects are simple.

I do not understand.

2.0201 Every statement about complexes can be resolved into a statement about their constituents and into the propositions that describe the complexes completely.

This is kinda again, about defining complex types from primitive types. I guess, Wittgenstein means that there should be a set of primitive values that have primitive operations working on them (and calls "modern primitive objects" objects).

2.021 Objects make up the substance of the world. That is why they cannot be composite.

Here we would have to consider bit-wise operations. It seems that in modern programming we sometimes can extract pieces of primitive objects.

2.0211 If the world had no substance, then whether a proposition had sense would depend on whether another proposition was true.

Hm… suppose our function has no input. Is it the same as "no substance"?
2.0212 In that case we could not sketch any picture of the world (true or false).

Well, again, if your function has no input and no state, then it can only be constant, right? (Perhaps, a constant trajectory.)

TODO 2.022 It is obvious that an imagined world, however different it may be from the real one, must have something —a form— in common with it.

Interesting point. I do not understand it. Let us image that our function implements a general-purpose programming language. The it would still eventually have to be reduced to the primitives of the first function. Right?

2.023 Objects are just what constitute this unalterable form.

So that is still bits?

2.0231 The substance of the world can only determine a form, and not any material properties. For it is only by means of propositions that material properties are represented — only by the configuration of objects that they are produced.

This seems like a re-iteration of the fact that numbers do not mean anything by themselves, only in conjunction with their interpretation.

TODO 2.023 In a manner of speaking, objects are colourless.

Does it mean that "bits mean nothing"?

2.0233 If two objects have the same logical form, the only distinction between them, apart from their external properties, is that they are different.

So (255,0,0) can mean a red colour, or a coordinate of a 3d point.
1. 2.02331 Either a thing has properties that nothing else has, in which case we can immediately use a description to distinguish it from the others and refer to it; or, on the other hand, there are several things that have the whole set of their properties in common, in which case it is quite impossible to indicate one of them. For if there is nothing to distinguish a thing, I cannot distinguish it, since otherwise it would be distinguished after all.
  
  Is this "type punning"? If things are described with the same set of parameters, then reading the memory cannot tell you which of the things was actually meant. If you add an explicit type qualifier, it is kinda "a parameter", isn't it?

2.024 Substance is what subsists independently of what is the case.

Bits do not know anything about "reality".

2.025 It is form and content.

So… bits are the only thing that really exists.

2.0251 Space, time, and colour (being coloured) are forms of objects.

Or a concrete juxtaposition of bit values.

TODO 2.026 There must be objects, if the world is to have an unalterable form.

I do not understand. Well, let us assume that we can partition the input into "objects". Then, unless we can re-partition the input, the objects keep being objects.

2.027 Objects, the unalterable, and the subsistent are one and the same.

Like, the immutable input string.

2.0271 Objects are what is unalterable and subsistent; their configuration is what is changing and unstable.

Seems naturally following from the above.
2.0272 The configuration of objects produces states of affairs.

That is input.

2.0.3. TODO 2.03 In a state of affairs objects fit into one another like the links of a chain.

I do not understand. Does it mean that the input is sequential?

2.031 In a state of affairs objects stand in a determinate relation to one another.

Again, if you partition the input into arguments, their order is the only thing that defines which one is which, because otherwise they are just bits.
2.032 The determinate way in which objects are connected in a state of affairs is the structure of the state of affairs.

That is "a function prototype", isn't it?
TODO 2.033 Form is the possibility of structure.

Unclear.
2.034 The structure of a fact consists of the structures of states of affairs.

I think that by a "fact" he means "what is true". A "structure of truth". It is a bit hard to explain.

Perhaps we can see this as saying: the world is the evolution of the memory. Things that logic (or machine) can compute at the same time, will be in the memory.

2.0.4. 2.04 The totality of existing states of affairs is the world.

Again, we only have the input as the thing that provides knowledge.

2.0.5. 2.05 The totality of existing states of affairs also determines which states of affairs do not exist.

Or which inputs are prohibited.

2.0.6. TODO 2.06 The existence and non-existence of states of affairs is reality. (We also call the existence of states of affairs a positive fact, and their non-existence a negative fact.)

We can compute that the input we are given is meaningless or contradictory. This is our "negative fact". Or, we can infer that a certain situation can never possibly occur.

2.0.7. 2.061 States of affairs are independent of one another.

Because they represent different, mutually exclusive, machine states.

2.062 From the existence or non-existence of one state of affairs it is impossible to infer the existence or nonexistence of another.

Because they are inputs, right?
2.063 The sum-total of reality is the world.

Or, rather, everything that is computable?

2.1. 2.1 We picture facts to ourselves.

Isn't that again, that all logic needs an interpreter?

2.1.1. 2.11 A picture presents a situation in logical space, the existence and non-existence of states of affairs.

Is that a "state of memory" again?

2.1.2. 2.12 A picture is a model of reality.

That's just it? What am I missing here?

2.1.3. 2.13 In a picture objects have the elements of the picture corresponding to them.

This seems to be, again about a mapping between reality and computing.

2.131 In a picture the elements of the picture are the representatives of objects.

So, the "picture" is expected to me a representation of the "form"?

2.1.6. 2.16 If a fact is to be a picture, it must have something in common with what it depicts.

What is "it" here? A picture or a fact?

2.161 There must be something identical in a picture and what it depicts, to enable the one to be a picture of the other at all.

Seems, again to stress that the representation is non-ideal.

2.1.7. 2.17 What a picture must have in common with reality, in order to be able to depict it —correctly or incorrectly— in the way it does, is its pictorial form.

All, right. So, there is what we would have called a "canonical pictorial form" of reality, that represents it exhaustively. And "our picture", non-ideal, must have the same skeleton as the "canonical picture".

2.171 A picture can depict any reality whose form it has. A spatial picture can depict anything spatial, a coloured one anything coloured, etc.

Perhaps I was completely wrong about the previous point. It seems we are speaking again about the types and data structures.
2.172 A picture cannot, however, depict its pictorial form: it displays it.

Like, this sounds like an "area of applicability", "area of effect", or "domain". A picture, indeed, cannot display "a picture".
2.173 A picture represents its subject from a position outside it. (Its standpoint is its representational form.) That is why a picture represents its subject correctly or incorrectly.

The "representational form" here seems to be "the way representation is designed". For example, a picture may consist of a grid of pixels.

Correctness and incorrectness here seem under-specified. There may be pictures that cannot be represented by a grid of pixels.

On the other hand, a potentially valid representation may be "just wrong". For example, if an apple is represented as an image of a plum.
2.174 A picture cannot, however, place itself outside its representational form.

A grid of pixels cannot represent something that is not representable as a grid of pixels.

2.1.8. 2.18 What any picture, of whatever form, must have in common with reality, in order to be able to depict it —correctly or incorrectly— in any way at all, is logical form, i.e. the form of reality.

So… this… "grid of pixels" is, a "logical form". Perhaps, since you can express pixels as bits.

2.181 A picture whose pictorial form is logical form is called a logical picture.

Seems a tautological statement, or… like… the thing that is in essence a "universal Turing machine".
2.182 Every picture is at the same time a logical one. (On the other hand, not every picture is, for example, a spatial one.)

Yeah, since we can eventually interpret everything as bits and predicates.

2.1.9. 2.19 Logical pictures can depict the world.

Really? Well, but how well?

2.2. 2.2 A picture has logico-pictorial form in common with what it depicts.

I guess, bits should be similar to bits. But what about different data structures?

2.201 A picture depicts reality by representing a possibility of existence and non-existence of states of affairs.

And the existence and non-existence are the primary question of logic. Basically, everything can be reduced to logic.
2.202 A picture represents a possible situation in logical space.

A configuration of bits.
2.203 A picture contains the possibility of the situation that it represents.

Since it is given.

2.2.1. 2.21 A picture agrees with reality or fails to agree; it is correct or incorrect, true or false.

What about imprecise representations?

2.2.2. 2.22 What a picture represents it represents independently of its truth or falsity, by means of its pictorial form.

Does it mean, that we can, basically, draw pictures of imaginary things?

TODO 2.221 What a picture represents is its sense.

What is "sense"? This seems one of the "leaves" that deals with inexplicable.
2.222 The agreement or disagreement of its sense with reality constitutes its truth or falsity.

Em… is it like, we are drawing something that is not correctly representing reality… And if it represents it incorrectly, we say that it is false?
2.223 In order to tell whether a picture is true or false we must compare it with reality.

Great! How?
2.224 It is impossible to tell from the picture alone whether it is true or false.

Sure, you can generate a configuration of bits at random.
2.225 There are no pictures that are true a priori.

Again, because you can generate everything.

3. 3 A logical picture of facts is a thought.

This seems to mean that the only thing that is in order to even get in contact with the facts, we need to picture them in a logical substrate (express in bits).

3.001 ‘A state of affairs is thinkable’: what this means is that we can picture it to ourselves.

I think that some states of affairs are not thinkable. But those that are thinkable, we should be able to express as our internal logical language.

3.0.1. 3.01 The totality of true thoughts is a picture of the world.

Maybe incomplete, right?

3.0.2. 3.02 A thought contains the possibility of the situation of which it is the thought. What is thinkable is possible too.

Seems that the definition of "possible" is strange here. I guess, the axioms may be strange.

3.0.3. 3.03 Thought can never be of anything illogical, since, if it were, we should have to think illogically.

Well, if logic is "the driving force" of thinking, it has to be like this…

3.031 It used to be said that God could create anything except what would be contrary to the laws of logic. — The truth is that we could not say what an ‘illogical’ world would look like.

Because, supposedly, a computing brain is driven by logic and cannot transcend it.
3.032 It is as impossible to represent in language anything that ‘contradicts logic’ as it is in geometry to represent by its co-ordinates a figure that contradicts the laws of space, or to give the co-ordinates of a point that does not exist.

This raises a question whether we can say anything about uncomputable numbers.
1. 3.0321 Though a state of affairs that would contravene the laws of physics can be represented by us spatially, one that would contravene the laws of geometry cannot.
  
  That is a "comment", right? So, "logical" does not necessarily mean "true". May be false, but still logical.

3.0.4. 3.04 If a thought were correct a priori, it would be a thought whose possibility ensured its truth.

But they are not, aren't they? On the other hand, perhaps, it could be possible to devise a logical system which would not be able to express false statements?

3.0.5. 3.05 A priori knowledge that a thought was true would be possible only if its truth were recognizable from the thought itself (without anything to compare it with).

That's, like, the laws of your logical system? Or, perhaps, functions without input correspond to "apriori truths"?

3.1. 3.1 In a proposition a thought finds an expression that can be perceived by the senses.

A thought is a "logical picture of facts". Proposition, here, perhaps, means "something that we can compare with the senses".

3.1.1. 3.11 We use the perceptible sign of a proposition (spoken or written, etc.) as a projection of a possible situation. The method of projection is to think of the sense of the proposition.

Apparently, the language of bits (logic) is more general that specific sub-languages that describe measurable things. We project therefore.

3.1.3. 3.13 A proposition includes all that the projection includes, but not what is projected. Therefore, though what is projected is not itself included, its possibility is. A proposition, therefore, does not actually contain its sense, but does contain the possibility of expressing it. (‘The content of a proposition’ means the content of a proposition that has sense.) A proposition contains the form, but not the content, of its sense.

A proposition is a perceptible projection of a thought.

3.1.4. 3.14 What constitutes a propositional sign is that in it its elements (the words) stand in a determinate relation to one another. A propositional sign is a fact.

Are "words" here real words, or formal combinations of letters? Are data structures words? How is a "determinate relation" formalised?

TODO 3.141 A proposition is not a blend of words. —(Just as a theme in music is not a blend of notes.) A proposition is articulate.

How exactly? Are there rules to this articulation?
3.142 Only facts can express a sense, a set of names cannot.

I do not understand. So, there is some substance?
3.143 Although a propositional sign is a fact, this is obscured by the usual form of expression in writing or print. For in a printed proposition, for example, no essential difference is apparent between a propositional sign and a word. (That is what made it possible for Frege to call a proposition a composite name.)

So, basically, we express our propositions in a language, that is "physical", and bits are encoded in memory, however, what we actually mean are logical propositions.
1. 3.1431 The essence of a propositional sign is very clearly seen if we imagine one composed of spatial objects (such as tables, chairs, and books) instead of written signs. Then the spatial arrangement of these things will express the sense of the proposition.
  
  We would still have to assign a meaning to the objects, right?
2. 3.1432 Instead of, ‘The complex sign “aRb” says that a stands to b in the relation R’, we ought to put, ‘That “a” stands to “b” in a certain relation says that aRb.’
  
  Because a proposition is a projection of a fact.
3.144 Situations can be described but not given names. (Names are like points; propositions like arrows — they have sense.)

What are "situations"? It seems that giving a name to a set is fine. Why not?

3.2. 3.2 In a proposition a thought can be expressed in such a way that elements of the propositional sign correspond to the objects of the thought.

So we are writing this "proposition" to represent a thought (a logical picture of facts) in some way. With words, I presume.

3.201 I call such elements ‘simple signs’, and such a proposition ‘completely analysed’.

If a proposition is properly written, so that "words", or "simple signs" correspond to reality, then is it "completely analysed"?

Or, let's try to think about it from another angle: If you have a function, an ⍺-abstraction of something, and you do an "analysis" step, that is you resolve all scheme symbols in a form and obtain their memory values, then you are also getting something "completely analysed".
3.202 The simple signs employed in propositions are called names.

In Scheme do "names" correspond to "symbols" at an analysis stage?
3.203 A name means an object. The object is its meaning. (‘A’ is the same sign as ‘A’.)

So, "symbols" are unique. Do they evaluate to themselves?

3.2.1. 3.21 The configuration of objects in a situation corresponds to the configuration of simple signs in the propositional sign.

Necessarily? By construction? Doesn't seem to be the case, although if you are only considering exact correspondences between logic and Scheme, it's probably true.

3.2.2. 3.22 In a proposition a name is the representative of an object.

So, a procedure.

3.221 Objects can only be named. Signs are their representatives. I can only speak about them: I cannot put them into words. Propositions can only say how things are, not what they are.

That seems arguable. I should be able to "serialise" objects.

Propositions, functions, evidently, predict behaviour, not describe things.

3.2.3. 3.23 The requirement that simple signs be possible is the requirement that sense be determinate.

Well, otherwise a program is broken. But what does it mean "possible"? Resolve to themselves or resolve to values?

3.2.4. 3.24 A proposition about a complex stands in an internal relation to a proposition about a constituent of the complex. A complex can be given only by its description, which will be right or wrong. A proposition that mentions a complex will not be nonsensical, if the complex does not exist, but simply false. When a propositional element signifies a complex, this can be seen from an indeterminateness in the propositions in which it occurs. In such cases we know that the proposition leaves something undetermined. (In fact the notation for generality contains a prototype.) The contraction of a symbol for a complex into a simple symbol can be expressed in a definition.

So a definition is kinda like in Scheme. You use the special definition word to make a simple symbol 'symbol resolve to a value, maybe complex.

If our forms use a symbol with an undefined value, then what? This symbol resolves to false? Seems so.

What is a prototype here?

3.2.5. TODO 3.25 A proposition has one and only one complete analysis.

Because we are, again, just substituting symbols?

3.251 What a proposition expresses it expresses in a determinate manner, which can be set out clearly: a proposition is articulate.

So, a proposition is a "formal construction"?

3.2.6. 3.26 A name cannot be dissected any further by means of a definition: it is a primitive sign.

Like, this is, perhaps, that place where "symbols evaluate to themselves".

3.261 Every sign that has a definition signifies via the signs that serve to define it; and the definitions point the way. Two signs cannot signify in the same manner if one is primitive and the other is defined by means of primitive signs. Names cannot be anatomized by means of definitions. (Nor can any sign that has a meaning independently and on its own.)

So, I cannot implement a function that behaves in an exactly the same way as a built-in?

Names, indeed, cannot be broken into pieces, because a definition is only an association.
3.262 What signs fail to express, their application shows. What signs slur over, their application says clearly.

I think what we can understand "application" here in the same way we understand an application in Scheme. Indeed, not everything is clear from a function definition, but running a function should give us all information about it.

Or not? At least, we cannot solve the halting problem like this. But logic cannot either.
3.263 The meanings of primitive signs can be explained by means of elucidations. Elucidations are propositions that contain the primitive signs. So they can only be understood if the meanings of those signs are already known.

We usually explain primitives in English. In the Scheme Report, the primitives are defined though something called "Denotational Semantics".

I guess, what he really says here is that if we know a "meaning of a function", or its truth-table, we could, potentially, infer the meaning of the primitives, perhaps in a way similar to solving an equation.

3.3. 3.3 Only propositions have sense; only in the nexus of a proposition does a name have meaning.

I guess, we can have propositions that are very simple? Such as "resolving a value of a name"?

3.3.1. 3.31 I call any part of a proposition that characterizes its sense an expression (or a symbol). (A proposition is itself an expression.) Everything essential to their sense that propositions can have in common with one another is an expression. An expression is the mark of a form and a content.

So… expressions are like meaningful sub-units of propositions. There maybe other things in a proposition, but those are not related to its meaning. For example, there may be debugging or type annotations.

But how do "form and content" come together here?

3.311 An expression presupposes the forms of all the propositions in which it can occur. It is the common characteristic mark of a class of propositions.

A class of "propositions using this expression"? Perhaps, but what would it give us?

Inlining for speed?
3.312 It is therefore presented by means of the general form of the propositions that it characterizes. In fact, in this form the expression will be constant and everything else variable.

Totally confusing? What is a "general form of the propositions"?

Is it even possible to devise a non-trivial "general form of propositions using an expression A"?

Does the phrase "everything else variable" make this clause useless?
3.313 Thus an expression is presented by means of a variable whose values are the propositions that contain the expression. (In the limiting case the variable becomes a constant, the expression becomes a proposition.) I call such a variable a ‘propositional variable’.

Okay, in "high-school logic", a propositional variable is a variable that can be true or false. I guess, you can say that a variable becomes true if an corresponding expression evaluates to true.
3.314 An expression has meaning only in a proposition. All variables can be construed as propositional variables. (Even variable names.)

It seems that Wittgenstein dislikes REPLs. Or that "top-level is not very well defined".

Otherwise, it seems that, indeed, all "expressions" can be assigned to variables. (Note, however, that Scheme allows for syntactic expressions.)

"All variables can be construed as propositional variables (even variable names)", I guess, means that (=? (symbol->string 'variable) "variable") => #t
3.315 If we turn a constituent of a proposition into a variable, there is a class of propositions all of which are values of the resulting variable proposition. In general, this class too will be dependent on the meaning that our arbitrary conventions have given to parts of the original proposition. But if all the signs in it that have arbitrarily determined meanings are turned into variables, we shall still get a class of this kind. This one, however, is not dependent on any convention, but solely on the nature of the proposition. It corresponds to a logical form — a logical prototype.

Hm… This "logical prototype" seems to be a function that has no constants? It can have calls to other functions or values, but those can only be supplied as arguments…

Are this, like… pure functions?

Or, is the idea in that a function with no constants is actually equivalent to a type?
3.316 What values a propositional variable may take is something that is stipulated. The stipulation of values is the variable.

Seems, again, that now we are defining a "type" for a variable (as opposed to a function (proposition)).
3.317 To stipulate values for a propositional variable is to give the propositions whose common characteristic the variable is. The stipulation is a description of those propositions. The stipulation will therefore be concerned only with symbols, not with their meaning. And the only thing essential to the stipulation is that it is merely a description of symbols and states nothing about what is signified. How the description of the propositions is produced is not essential.

Seems like… So, here we are defining a variable (a data structure), via a set of accessors to this data structure. The "meaning" here is really the "implementation" of the data structure and the accessors.
3.318 Like Frege and Russell I construe a proposition as a function of the expressions contained in it.

I should look up what Frege and Russel did.

But it is quite reasonable that you can make expressions by combining more primitive expressions.

3.3.2. 3.32 A sign is what can be perceived of a symbol.

So, in Scheme symbols stand for stuff that can be resolved to values (maybe undefined), and can be compared in O(1).

In Lisp in general, people are even expected to speak directly with symbols in runtime.

I guess, what Wittgenstein is saying here, is that we do not really need a disjoint type for symbols in programming languages. (lookup-variable-value) would be designed to work with strings, not symbols.

3.321 So one and the same sign (written or spoken, etc.) can be common to two different symbols — in which case they will signify in different ways.

Identifiers shadow other identifiers. That is the same thing can have different values in different contexts.
3.322 Our use of the same sign to signify two different objects can never indicate a common characteristic of the two, if we use it with two different modes of signification. For the sign, of course, is arbitrary. So we could choose two different signs instead, and then what would be left in common on the signifying side?

Same thing. Using the same variable name for different values in different contexts does not make those values in any sense related.
3.323 In everyday language it very frequently happens that the same word has different modes of signification — and so belongs to different symbols — or that two words that have different modes of signification are employed in propositions in what is superficially the same way. Thus the word ‘is’ figures as the copula, as a sign for identity, and as an expression for existence; ‘exist’ figures as an intransitive verb like ‘go’, and ‘identical’ as an adjective; we speak of something, but also of something’s happening. (In the proposition, ‘Green is green’ — where the first word is the proper name of a person and the last an adjective — these words do not merely have different meanings: they are different symbols.)

So, Wittgenstein is suggesting to have a separate obarray for each context? Hmm, no, doesn't seem to be the case. On the other hand, "Green" and "green" seem to resolve to two different values in the same sentence.

On the other hand, maybe it is just that the Sign and a Symbol are reversed?
3.324 In this way the most fundamental confusions are easily produced (the whole of philosophy is full of them).

Programming as well! That is why we sometimes think that static types should save us. With static types at least the type of resolved variables should match.
3.325 In order to avoid such errors we must make use of a sign-language that excludes them by not using the same sign for different symbols and by not using in a superficially similar way signs that have different modes of signification: that is to say, a sign-language that is governed by logical grammar — by logical syntax. (The conceptual notation of Frege and Russell is such a language, though, it is true, it fails to exclude all mistakes.)

This seems to be a prescriptive clause.

Ok, my thought. The difficulty with human languages is that we are ready to search different obarrays for variable resolution. And if an obarray fits the propositional structure, we start resolving most variables from that obarray.

People seem to have several environments of evaluation of an incoming proposition. The correct obarray is chosen if all variables of a clause are resolvable from this environment. If all variables are resolvable from two environments, it's either a tautology, or a pun.

So, for clarity, we should stick to using a single obarray.

Upd: this is the first place where he speaks about "logical syntax", not even defining it properly.
3.326 In order to recognize a symbol by its sign we must observe how it is used with a sense.

That seems to be it! That is the ambiguity resolution heuristic. One variable alone may be resolvable from environment-1, but we need all of them resolve.
3.327 A sign does not determine a logical form unless it is taken together with its logico-syntactical employment.

So, programs may consist of everything, essentially. The programming system is not defined by words used for if or goto, those could have been si or aller.
3.328 If a sign is useless, it is meaningless. That is the point of Occam’s maxim. (If everything behaves as if a sign had meaning, then it does have meaning.)

Again, the sign (Scheme-symbol) is eliminated at read-time, and replaced with a memory pointer, an internal representation. In this sense, it does not matter.

3.3.3. 3.33 In logical syntax the meaning of a sign should never play a rôle. It must be possible to establish logical syntax without mentioning the meaning of a sign: only the description of expressions may be presupposed.

The idea here seems to be that programming language standards may not have any prose.

That is there should be some description of clause transformations under an effect of syntactic signs.

I think that syntactic signs here are not just what Scheme calls "syntax", but also what Scheme calls "procedures".

3.331 From this observation we turn to Russell’s ‘theory of types’. It can be seen that Russell must be wrong, because he had to mention the meaning of signs when establishing the rules for them.

I have no idea what Russel's 'theory of types' was and whether is had any relation to the modern static typing.
3.332 No proposition can make a statement about itself, because a propositional sign cannot be contained in itself (that is the whole of the ‘theory of types’).

Perhaps, this can be related to Footnote 80 in Structure and Interpretation of Computer Programs (SICP). (page 649 in the Unofficial Texinfo Format 2.andresraba5.6)

When making recursive propositions, some very strange infinite series may arise.
TODO 3.333 The reason why a function cannot be its own argument is that the sign for a function already contains the prototype of its argument, and it cannot contain itself. For let us suppose that the function F(fx) could be its own argument: in that case there would be a proposition ‘F(F(fx))’, in which the outer function F and the inner function F must have different meanings, since the inner one has the form φ(fx) and the outer one has the form ψ(φ(fx)). Only the letter ‘F’ is common to the two functions, but the letter by itself signifies nothing. This immediately becomes clear if instead of ‘F(Fu)’ we write ‘(∃φ):F(φu).φu = Fu’. That disposes of Russell’s paradox.

I do not understand 😢.
3.334 The rules of logical syntax must go without saying, once we know how each individual sign signifies.

Is it a prescriptive clause? Moreover, it's "how each sign signifies", not "what each sign signifies". I am confused.

Perhaps, it means the same thing: that as long as you have properly written syntax as clause transformations, you do not need to explain the semantics "in English".

3.3.4. 3.34 A proposition possesses essential and accidental features. Accidental features are those that result from the particular way in which the propositional sign is produced. Essential features are those without which the proposition could not express its sense.

It seems that the difference here is, again, between "a program" and "a correct program". Performance seems to be the issue here as well.

3.341 So what is essential in a proposition is what all propositions that can express the same sense have in common. And similarly, in general, what is essential in a symbol is what all symbols that can serve the same purpose have in common.

A "symbol" is "a value" here. So, the propositions that are "essentially the same" would be: (and a b) and (and a b #t) and (and a b "true") #t and "true" can serve as the same true value.
1. 3.3411 So one could say that the real name of an object was what all symbols that signified it had in common. Thus, one by one, all kinds of composition would prove to be unessential to a name.
  
  "all kinds of composition would prove to be unessential to a name." what does it even mean?
  
  When you do not care about resource consumption, you can avoid using a name at all, and try to infer the correct value by writing all sorts of propositions about an object and telling the machine to infer the object from those propositions when needed. It it what is implied here?
3.342 Although there is something arbitrary in our notations, this much is not arbitrary — that when we have determined one thing arbitrarily, something else is necessarily the case. (This derives from the essence of notation.)

I guess, this means that a variable cannot be "empty". If you introduce a name in a proposition, say, (display a), then in the worst case your interpreter will tell you "a is undefined", that is "the value of a is #undef". This may be an error, or may not be. But a will be in the obarray until the termination of the program, so it will have some kind of "value".
1. 3.3421 A particular mode of signifying may be unimportant but it is always important that it is a possible mode of signifying. And that is generally so in philosophy: again and again the individual case turns out to be unimportant, but the possibility of each individual case discloses something about the essence of the world.
  
  Well, each copy of a program is philosophically unimportant, but the fact that a program as a packaged thought exists, is quite extraordinary philosophically. Moreover, programs generally seem to encompass the way of thinking of their authors.
3.343 Definitions are rules for translating from one language into another. Any correct sign-language must be translatable into any other in accordance with such rules: it is this that they all have in common.

That should be the case if, once again, you ignore Input/Output, and, possibly, mutation. I guess, then you should be left with Turing-completeness? Or, maybe, some other kind of completeness?
3.344 What signifies in a symbol is what is common to all the symbols that the rules of logical syntax allow us to substitute for it.

I feel like he's hand-waving a lot about the "logical syntax", not even defining it properly.

On the other hand, again, if we define a datatype through the accessors, then "what is common to all the symbols" just means that "semantics allows us to abstract over concrete implementations, as long as the accessors still work".
1. 3.3441 For instance, we can express what is common to all notations for truth-functions in the following way: they have in common that, for example, the notation that uses ‘~p’ (‘not p’) and ‘p v q’ (‘p or q’) can be substituted for any of them. (This serves to characterize the way in which something general can be disclosed by the possibility of a specific notation.)
  
  So, is this just about making sure that Scheme and C are equivalent in the logical sense?
2. 3.3442 Nor does analysis resolve the sign for a complex in an arbitrary way, so that it would have a different resolution every time that it was incorporated in a different proposition.
  
  It would have been a nice testing heuristic.
  
  I am not very sure, but it seems that "analysis" in this clause really is the same as the "analysis" in SICP. When a complex sign is transformed into a lambda, there is only one way of doing so throughout the program? Or isn't it?
  
  Not very sure.

3.4. 3.4 A proposition determines a place in logical space. The existence of this logical place is guaranteed by the mere existence of the constituents — by the existence of the proposition with a sense.

This is kind of obvious. Your proposition will eventually be broken down into the elementary variables, bits, that can be 1 or 0, and that is "the space" and "the place". The proposition will have sense, but may be false.

3.4.1. 3.41 The propositional sign with logical co-ordinates — that is the logical place.

Obvious.

3.411 In geometry and logic alike a place is a possibility: something can exist in it.

Sure.

3.4.2. 3.42 A proposition can determine only one place in logical space: nevertheless the whole of logical space must already be given by it. (Otherwise negation, logical sum, logical product, etc.; would introduce more and more new elements — in co-ordination.) (The logical scaffolding surrounding a picture determines logical space. The force of a proposition reaches through the whole of logical space.)

Kinda means that a relation is a subset of the Cartesian product of the input and output. First year mathematical logic.

3.5. 3.5 A propositional sign, applied and thought out, is a thought.

A "thought" is a logical picture of facts, as he defines it. So, evaluating a proposition should give us a "logical picture of facts"?

4. 4 A thought is a proposition with a sense.

Don't all propositions have sense by construction? Or, maybe, it's just complex propositions that are guaranteed to have sense. Basic ones may not have any sense.

4.001 The totality of propositions is language.

Theoretical totality or practical totality? What about propositions that have no sense?
4.002 Man possesses the ability to construct languages capable of expressing every sense, without having any idea how each word has meaning or what its meaning is — just as people speak without knowing how the individual sounds are produced. Everyday language is a part of the human organism and is no less complicated than it. It is not humanly possible to gather immediately from it what the logic of language is. Language disguises thought. So much so, that from the outward form of the clothing it is impossible to infer the form of the thought beneath it, because the outward form of the clothing is not designed to reveal the form of the body, but for entirely different purposes. The tacit conventions on which the understanding of everyday language depends are enormously complicated.

This clause is almost saying that "The theory in this book should not be applied to human languages, really". Leave "exact symbolism" to machines.
4.003 Most of the propositions and questions to be found in philosophical works are not false but nonsensical. Consequently we cannot give any answer to questions of this kind, but can only point out that they are nonsensical. Most of the propositions and questions of philosophers arise from our failure to understand the logic of our language. (They belong to the same class as the question whether the good is more or less identical than the beautiful.) And it is not surprising that the deepest problems are in fact not problems at all.

To me at least, this clause should be interpreted in the following way: What "makes sense" is really equivalent to "what we can write a program about, deciding whether things are true or false".

The "deep questions proposed in philosophy that have no logical sense" are not really lacking sense, they are illustrating the difference between that part of the language (and with it, the reality) that we have already harnessed by logic (and thus can reason about, and write programs about), and the part that we are capable of operating with (as humans), but do not understand well enough and deep enough to write programs about.

This is the central raison-d-être of philosophy: spotting those areas that exist, but to which the logical abstraction tree that allows us to create sciences and societies has not yet grown.
1. 4.0031 All philosophy is a ‘critique of language’ (though not in Mauthner’s sense). It was Russell who performed the service of showing that the apparent logical form of a proposition need not be its real one.
  
  Fritz Mauthner, German author, theatre critic, and exponent of philosophical Skepticism derived from a critique of human knowledge.
  
  I'd really like him to point out where exactly Russel shows this.
  
  If we recall Eric Berne here, who introduced the notion of "transactions" in human interaction, and suggested that superficial transaction structure may not be the underlying one. "Let me show you my wonderful haystack in the barn" may be interpreted ambiguously.

4.0.1. 4.01 A proposition is a picture of reality. A proposition is a model of reality as we imagine it.

Indeed. We are inputting those propositions into the machine, and thus creating a model of reality. Then we can query the machine and see if your model of reality is any good.

"We imagine it" should be really read as "as the program describes it".

4.011 At first sight a proposition — one set out on the printed page, for example — does not seem to be a picture of the reality with which it is concerned. But neither do written notes seem at first sight to be a picture of a piece of music, nor our phonetic notation (the alphabet) to be a picture of our speech. And yet these sign-languages prove to be pictures, even in the ordinary sense, of what they represent.

In the "ordinary sense" here should hint us that machines should eventually recognise code written on paper with pen, and be able to interpret it.

"At first sight" should hint us that we are missing a huge lot when discussing programs without data (!). This is very important, as garbage-in=garbage-out.

Notes are a nice illustration here, because in order to generate wave-forms from notes, you need to have a sound bank, or do FM-synthesis.
4.012 It is obvious that a proposition of the form ‘aRb’ strikes us as a picture. In this case the sign is obviously a likeness of what is signified.

Well, since logic is such a primitive domain, similar to manipulation of characters.
4.013 And if we penetrate to the essence of this pictorial character, we see that it is not impaired by apparent irregularities (such as the use of ♯ and ♭ in musical notation). For even these irregularities depict what they are intended to express; only they do it in a different way.

Emm..? How are sharp and flat "irregularities"? Aren't they a valid part of music notation?

Let us try to not overthink it. Music notation consists of notes. Obviously, sharp and flat are extraneous here.
4.014 A gramophone record, the musical idea, the written notes, and the sound-waves, all stand to one another in the same internal relation of depicting that holds between language and the world. They are all constructed according to a common logical pattern. (Like the two youths in the fairy-tale, their two horses, and their lilies. They are all in a certain sense one.)

So… a piece of memory, a file, is "real". We can take this text, written in an org file, and put it through a TTS engine, to obtain its audio form, or put it onto the display, to obtain a picture.

What I do not like here, is the mention of the fairy-tale. Horses and lilies are aspects of the concept, not views. A pair of lovers in a fairy tale is not necessarily bound to have a pair of horses. But maybe that is just me.
1. 4.0141 There is a general rule by means of which the musician can obtain the symphony from the score, and which makes it possible to derive the symphony from the groove on the gramophone record, and, using the first rule, to derive the score again. That is what constitutes the inner similarity between these things which seem to be constructed in such entirely different ways. And that rule is the law of projection which projects the symphony into the language of musical notation. It is the rule for translating this language into the language of gramophone records.
  
  Fair point. Seems to be exactly about what I wrote above. The "symphony" is bytes, and we can generate different presentations of those bytes.
4.015 The possibility of all imagery, of all our pictorial modes of expression, is contained in the logic of depiction.

I would say "bound" rather than "contained". Your expressions have to be valid with respect to the language that you are writing them in. An you can, in principle, describe "all possible images" that can be written in, say, a 32-bit TIFF file. That set is not just well-defined, it is finite.
4.016 In order to understand the essential nature of a proposition, we should consider hieroglyphic script which depicts the facts that it describes. And alphabetic script developed out of it without losing what was essential to depiction.

Not very useful illustration, but doesn't hurt.

4.0.2. 4.02 We can see this from the fact that we understand the sense of a propositional sign without its having been explained to us.

Well, if the syntax of your proposition is correct…

4.021 A proposition is a picture of reality: for if I understand a proposition, I know the situation that it represents. And I understand the proposition without having had its sense explained to me.

If my compiler compiles the code, it more or less means that it understands it as much as it understands anything.
4.022 A proposition shows its sense. A proposition shows how things stand if it is true. And it says that they do so stand.

I think, this a little bit refers to the constructivist approach of computer logic. In order to formulate a computable proposition, you need to formulate it in a constructive way, "how to check that it is correct".

Does it mean that Wittgenstein's theory does not work for anything non-constructive?
4.023 A proposition must restrict reality to two alternatives: yes or no. In order to do that, it must describe reality completely. A proposition is a description of a state of affairs. Just as a description of an object describes it by giving its external properties, so a proposition describes reality by its internal properties. A proposition constructs a world with the help of a logical scaffolding, so that one can actually see from the proposition how everything stands logically if it is true. One can draw inferences from a false proposition.

This yes-or-no reduction is very common in Computer Science. Very few functions in Computer Science are formulated as "What is f(x)?" Rather, they are formulated as "Is the first bit of f(x) equal to 1?".

The dependency between internal and external properties is not so obvious. We may look at it from an information-theoretic standpoint: describing an object's external properties, as long as those properties are consistent with the basic laws of the world, we are (in general) limiting its internal properties as well.

Propositions allow us to infer these internal properties.

Example: a vector linear equation \(Ax=0\). This is an external property of the set of \(x\).

An internal property would be the fact any linear combination of \(x_1\) and \(x_2\) would also be a solution. Or something like that.
4.024 To understand a proposition means to know what is the case if it is true. (One can understand it, therefore, without knowing whether it is true.) It is understood by anyone who understands its constituents.

Propositions are created according to a set of "laws of the world", aren't they?

Maybe, "to understand a proposition" here means "being able to read a proposition"? E.g. a compiler "understands" a function, if it is written in a language that it is written to understand.

A Scheme compiler understands Scheme. If a function written in Scheme is correctly written, the compiler may run it and produce a result that will be correct. (Because we believe that the function itself is correct.)
4.025 When translating one language into another, we do not proceed by translating each proposition of the one into a proposition of the other, but merely by translating the constituents of propositions. (And the dictionary translates not only substantives, but also verbs, adjectives, and conjunctions, etc.; and it treats them all in the same way.)

That's an important point!

And actually not always true in Lisp.

What is being said here is that when we are porting code from Scheme into Emacs Lisp, we are not writing an interpreter of Scheme in Emacs Lisp, but rather we are replacing Scheme's constructs with Emacs Lisp's.
4.026 The meanings of simple signs (words) must be explained to us if we are to understand them. With propositions, however, we make ourselves understood.

The specification of special forms must be given in a "human language" with a lot of hand-waving. Procedures, however, can be written purely formally.

(Remember that our world has no I/O, so we probably do not need any primitive procedures.)
4.027 It belongs to the essence of a proposition that it should be able to communicate a new sense to us.

That is, extract the "internal properties".

4.0.3. 4.03 A proposition must use old expressions to communicate a new sense. A proposition communicates a situation to us, and so it must be essentially connected with the situation. And the connexion is precisely that it is its logical picture. A proposition states something only in so far as it is a picture.

A subroutine is written using the language it is written in. (Rules of the world.)

The rest I do not understand.

Let's say that a proposition gives us some new information as long as it is extracting it from the "world", and therefore is "understanding" it.

It can produce a random 1 or 0, but this is not very useful.

4.031 In a proposition a situation is, as it were, constructed by way of experiment. Instead of, ‘This proposition has such and such a sense’, we can simply say, ‘This proposition represents such and such a situation’.

We can see "running a program" in some sense as an evolution of a world, in which the initial memory state is the initial state of the simulator, or the initial condition in a Cauchy (initial value) problem.

Functions, therefore, represent "what is happening" in the world, that is, a situation.
1. 4.0311 One name stands for one thing, another for another thing, and they are combined with one another. In this way the whole group — like a tableau vivant — presents a state of affairs.
  
  "a silent and motionless group of people arranged to represent a scene or incident." – tableau vivant
  
  This is kinda like…
  
  When your procedure is loaded into memory, the symbols and the parenthetical structure are transformed into an abstract syntactic tree.
  
  When you are resolving symbols, you are substituting them with the "actual things". So, a procedure, at the beginning of the evaluation, is like an image of the scene of an accident (event).
2. 4.0312 The possibility of propositions is based on the principle that objects have signs as their representatives. My fundamental idea is that the ‘logical constants’ are not representatives; that there can be no representatives of the logic of facts.
  
  "Logical constants" here do not mean bits, but rather the primitives of the programming language (or instruction set).
  
  Indeed, they are bound to be undefined and have no meaning for the machine itself, because "it just lives according to them".
  
  It is possible to implement a language in terms of another language, sure, but it at the bottom of it there will be a meaningless substrate.
4.032 It is only in so far as a proposition is logically articulated that it is a picture of a situation. (Even the proposition, ‘Ambulo’, is composite: for its stem with a different ending yields a different sense, and so does its ending with a different stem.)

We will ignore the Latin reference, it is not very useful.

I guess what is meant here is that the picture of the situation is drawn with the logical strokes.

4.0.4. 4.04 In a proposition there must be exactly as many distinguishable parts as in the situation that it represents. The two must possess the same logical (mathematical) multiplicity. (Compare Hertz’s Mechanics on dynamical models.)

The problem of "sameness" appears all over the science. As far as I remember, bosons and fermions are the most well-known example in the old science.

In programming this means that "distinguishable parts" are literally in the same memory. If they are represented by pointers, these pointers point to the same memory address.

4.041 This mathematical multiplicity, of course, cannot itself be the subject of depiction. One cannot get away from it when depicting.

I guess, this means that although the pointers have the same value, and point to the same address, the pointers themselves would inevitably be in different memory cells.
1. 4.0411 If, for example, we wanted to express what we now write as \((x).fx\) by putting an affix in front of \(fx\) — for instance by writing \(\mbox{Gen}. fx\) — it would not be adequate: we should not know what was being generalized. If we wanted to signalize it with an affix \(\mbox{\hskip}_g\) — for instance by writing \(f(x_g)\) — that would not be adequate either: we should not know the scope of the generality-sign. If we were to try to do it by introducing a mark into the argument-pieces — for instance by writing \((G,G).F(G,G)\) — it would not be adequate: we should not be able to establish the identity of the variables. And so on. All these modes of signifying are inadequate because they lack the necessary mathematical multiplicity.
  
  Wittgenstein is basically defining Scheme's \(let\) here. And/or discussing lexical scoping.
2. 4.0412 For the same reason the idealist’s appeal to ‘spatial spectacles’ is inadequate to explain the seeing of spatial relations, because it cannot explain the multiplicity of these relations.
  
  ‘Spatial spectacles’ here would probably be called ‘spatial coordinates’ by quantum mechanics. Or, maybe, "spatial representation".
  
  I think, the point here is that it is not enough to know the initial coordinates to predict behaviour. You need to also know which particle is "same" with the other particles.
  
  How can this even be? Maybe you can imagine this as having two observations of highly similar bosons in different positions. Then you add some spin to the boson 1. Does it mean that boson 2 immediately gets the same spin? May be possible if it is "the same" boson.
  
  In computers you can interpret it like this: If you (set! x 5) at some point, it does not mean that everything ever represented as x becomes 5. You need to consider the scope.

4.0.5. 4.05 Reality is compared with propositions.

I guess, we can just do it – compare reality with what the machine emits.

If this "reality" is really the physical reality, not some other Wittgensteinian concept I have missed. Let's see.

4.0.6. 4.06 A proposition can be true or false only in virtue of being a picture of reality.

I guess, it can be interpreted as "the truth only exists in a machine". Your functions may give a true or false answer only with respect to the initial tape configuration.

4.061 It must not be overlooked that a proposition has a sense that is independent of the facts: otherwise one can easily suppose that true and false are relations of equal status between signs and what they signify. In that case one could say, for example, that ‘p’ signified in the true way what ‘~p’ signified in the false way, etc.

Correctness, however, is independent off the initial tape state. If your function is expected to answer a question "are the bytes 1-14 set to 0", but instead outputs "true" or "false" at random, then it is just incorrect.
4.062 Can we not make ourselves understood with false propositions just as we have done up till now with true ones? — So long as it is known that they are meant to be false. — No! For a proposition is true if we use it to say that things stand in a certain way, and they do; and if by ‘p’ we mean ~p and things stand as we mean that they do, then, construed in the new way, ‘p’ is true and not false.

I am not entirely sure I understand the approach here… But I guess, he is saying that it is incorrect to say that by always using (not (p)) instead of (p), we can usually make a new true proposition.
1. 4.0621 But it is important that the signs ‘p’ and ‘~p’ can say the same thing. For it shows that nothing in reality corresponds to the sign ‘∼’. The occurrence of negation in a proposition is not enough to characterize its sense (~~p = p). The propositions ‘p’ and ‘~p’ have opposite sense, but there corresponds to them one and the same reality.
  
  Again, unclear to me what he is saying.
  
  The sign '~' can certainly be redefined, for example, to identity.
  
  Again, the underlying reality is not affected by our usage of (p) or (not (p)).
4.063 An analogy to illustrate the concept of truth: imagine a black spot on white paper: you can describe the shape of the spot by saying, for each point on the sheet, whether it is black or white. To the fact that a point is black there corresponds a positive fact, and to the fact that a point is white (not black), a negative fact. If I designate a point on the sheet (a truth-value according to Frege), then this corresponds to the supposition that is put forward for judgement, etc. etc. But in order to be able to say that a point is black or white, I must first know when a point is called black, and when white: in order to be able to say, ‘“p” is true (or false)’, I must have determined in what circumstances I call ‘p’ true, and in so doing I determine the sense of the proposition. Now the point where the simile breaks down is this: we can indicate a point on the paper even if we do not know what black and white are, but if a proposition has no sense, nothing corresponds to it, since it does not designate a thing (a truth-value) which might have properties called ‘false’ or ‘true’. The verb of a proposition is not ‘is true’ or ‘is false’, as Frege thought: rather, that which ‘is true’ must already contain the verb.

I guess, in our case the tape has ones and zeros, but a-priori we cannot tell whether 1 is true, or 0 is true?

Perhaps, Wittgenstein is trying to convey to us the same idea that Abelson-Sussman in SICP describe with the (false? x) procedure.

Indeed, our interpreter has to have some intrinsic understanding of "truth". See section 4.1.3 Evaluator Data Structures of SICP.
4.064 Every proposition must already have a sense: it cannot be given a sense by affirmation. Indeed its sense is just what is affirmed. And the same applies to negation, etc.

Well, in Scheme we can usually extend syntax to some extent…

Unclear.

However, maybe it just means that unless your code is syntactically correct as it is interpreted, your machine will just "not work".

This sounds a bit pessimistic, though. Kind of meaning that a purely logical machine will not be able to exceed its own limits.
1. 4.0641 One could say that negation must be related to the logical place determined by the negated proposition. The negating proposition determines a logical place different from that of the negated proposition. The negating proposition determines a logical place with the help of the logical place of the negated proposition. For it describes it as lying outside the latter’s logical place. The negated proposition can be negated again, and this in itself shows that what is negated is already a proposition, and not merely something that is preliminary to a proposition.
  
  Well, "preliminary to a proposition" is totally unclear. Wittgenstein never used such a thing before.
  
  Why is he so concerned with negation?

4.1. 4.1 Propositions represent the existence and non-existence of states of affairs.

Or, configurations of bytes with respect to each other.

4.1.1. 4.11 The totality of true propositions is the whole of natural science (or the whole corpus of the natural sciences).

Simply put, "a natural science is anything for which a robot can be made to answer questions".

4.111 Philosophy is not one of the natural sciences. (The word ‘philosophy’ must mean something whose place is above or below the natural sciences, not beside them.)

Above or below?

If philosophy is the area of thought about writing thoughts.

Or, maybe, about "encoding reality" in such a way that robots, then acting purely logically, would be able to do natural science.
4.112 Philosophy aims at the logical clarification of thoughts. Philosophy is not a body of doctrine but an activity. A philosophical work consists essentially of elucidations. Philosophy does not result in ‘philosophical propositions’, but rather in the clarification of propositions. Without philosophy thoughts are, as it were, cloudy and indistinct: its task is to make them clear and to give them sharp boundaries.

This also seems a lot like "encoding reality".
1. 4.1121 Psychology is no more closely related to philosophy than any other natural science. Theory of knowledge is the philosophy of psychology. Does not my study of sign-language correspond to the study of thought-processes, which philosophers used to consider so essential to the philosophy of logic? Only in most cases they got entangled in unessential psychological investigations, and with my method too there is an analogous risk.
  
  So, "theory of knowledge" is what is encoding the input for the psychologists (psychological algorithms) to work.
  
  I'm still thinking that it should be possible to make a better justification for psychology than just "encoding thoughts", perhaps, in physiology. But Wittgenstein didn't know much about psycho-physiology.
2. 4.1122 Darwin’s theory has no more to do with philosophy than any other hypothesis in natural science.
  
  Sure.
4.113 Philosophy sets limits to the much disputed sphere of natural science.

Aha, so philosophy is in "encoding the world", and "natural science" is then answering the questions about this encoding.

I guess, in commercial software this means that "analysis" is a philosophical job, and "implementation" is, then can be done in a purely scientific way.
4.114 It must set limits to what can be thought; and, in doing so, to what cannot be thought. It must set limits to what cannot be thought by working outwards through what can be thought.

Hm…

So, we're digitising/encoding reality, by philosophising, and at some point we are encountering a "Lower Bound" do this digitisation.
4.115 It will signify what cannot be said, by presenting clearly what can be said.

And it should also tell why our digitisation will fail if we try to go further.
4.116 Everything that can be thought at all can be thought clearly. Everything that can be put into words can be put clearly.

Hm… what about those ugly partially convergent functions? Those that can give us a response in many of the cases, but not all? Uncomputable, undecidable? Kolmogorov Complexity, for example?

4.1.2. 4.12 Propositions can represent the whole of reality, but they cannot represent what they must have in common with reality in order to be able to represent it — logical form. In order to be able to represent logical form, we should have to be able to station ourselves with propositions somewhere outside logic, that is to say outside the world.

Well, "design is code", right? If you specify your procedure well enough, you do not need to write it, you already have it?

But what about performance?

Also, it seems that he is claiming that an "Electric Programmer" is not possible, because logical synthesis is, apparently, not a clearly defined process?

4.121 Propositions cannot represent logical form: it is mirrored in them. What finds its reflection in language, language cannot represent. What expresses itself in language, we cannot express by means of language. Propositions show the logical form of reality. They display it.

I think that he is missing the "Metacircular Interpreter" discussion.

For sure, there is this last layer, at which you have to express a language in the language of the machine, and the primitives cannot be decomposed further.

But metacircularity still needs a discussion.
1. 4.1211 Thus one proposition ‘fa’ shows that the object a occurs in its sense, two propositions ‘fa’ and ‘ga’ show that the same object is mentioned in both of them. If two propositions contradict one another, then their structure shows it; the same is true if one of them follows from the other. And so on.
  
  Again, I think that some metalanguage reasoning systems, formal methods and such, can compute that inference in some cases.
2. 4.1212 What can be shown, cannot be said.
  
  Maybe, "not necessarily".
3. 4.1213 Now, too, we understand our feeling that once we have a sign-language in which everything is all right, we already have a correct logical point of view.
  
  I think that this means that we need a language that is correct and not self-contradictory, then any other language can be reinterpreted (reimplemented) using it.
4.122 In a certain sense we can talk about formal properties of objects and states of affairs, or, in the case of facts, about structural properties: and in the same sense about formal relations and structural relations. (Instead of ‘structural property’ I also say ‘internal property’; instead of ‘structural relation’, ‘internal relation’. I introduce these expressions in order to indicate the source of the confusion between internal relations and relations proper (external relations), which is very widespread among philosophers.) It is impossible, however, to assert by means of propositions that such internal properties and relations obtain: rather, this makes itself manifest in the propositions that represent the relevant states of affairs and are concerned with the relevant objects.

I guess, he wants to define what is external and what is internal here?

But he is not actually saying what an external property is?
1. 4.1221 An internal property of a fact can also be called a feature of that fact (in the sense in which we speak of facial features, for example).
  
  And those "features" are almost the same as the "features" in machine learning.
4.123 A property is internal if it is unthinkable that its object should not possess it. (This shade of blue and that one stand, eo ipso, in the internal relation of lighter to darker. It is unthinkable that these two objects should not stand in this relation.) (Here the shifting use of the word ‘object’ corresponds to the shifting use of the words ‘property’ and ‘relation’.)

So, this is kinda easy?

There bits represent a picture. No matter whether we want to make our algorithm distinguish pictures of cats from pictures of dogs, or just display a wallpaper, these bits are still thought to be a picture, not an audio wave.

So, "being an image" is an internal property of an array of bits.

I guess, "having a cat image in it" should be an external property?
4.124 The existence of an internal property of a possible situation is not expressed by means of a proposition: rather, it expresses itself in the proposition representing the situation, by means of an internal property of that proposition. It would be just as nonsensical to assert that a proposition had a formal property as to deny it.

Indeed, it is nonsensical to run an image recognising predicate on something that represents a sound wave.
1. 4.1241 It is impossible to distinguish forms from one another by saying that one has this property and another that property: for this presupposes that it makes sense to ascribe either property to either form.
  
  This I do not really understand. Perhaps, the idea here is that functions which work with byte data can "swallow" both an image representation, and a wavefront representation, and give "some" result. We wouldn't know whether it is correct, unless we know what the bytes actually stand for.
4.125 The existence of an internal relation between possible situations expresses itself in language by means of an internal relation between the propositions representing them.

I guess, this is speaking about one level of abstraction higher.

Say, a picture is a picture at address A is of a wavefront, which is itself digitised and placed at address B. There certainly may be a relationship between propositions operating on them.
1. 4.1251 Here we have the answer to the vexed question ‘whether all relations are internal or external’.
  
  And?
2. 4.1252 I call a series that is ordered by an internal relation a series of forms. The order of the number-series is not governed by an external relation but by an internal relation. The same is true of the series of propositions {‘aRb’, ‘(∃x):aRx.xRb’, ‘(∃x, y):aRx.xRy.yRb’,} and so forth. (If b stands in one of these relations to a, I call b a successor of a.)
  
  Isn't this the Church encoding, or something?
  
  Why is it an internal relation? A full order relation… seems to be derive-able from +1.
4.126 We can now talk about formal concepts, in the same sense that we speak of formal properties. (I introduce this expression in order to exhibit the source of the confusion between formal concepts and concepts proper, which pervades the whole of traditional logic.) When something falls under a formal concept as one of its objects, this cannot be expressed by means of a proposition. Instead it is shown in the very sign for this object. (A name shows that it signifies an object, a sign for a number that it signifies a number, etc.) Formal concepts cannot, in fact, be represented by means of a function, as concepts proper can. For their characteristics, formal properties, are not expressed by means of functions. The expression for a formal property is a feature of certain symbols. So the sign for the characteristics of a formal concept is a distinctive feature of all symbols whose meanings fall under the concept. So the expression for a formal concept is a propositional variable in which this distinctive feature alone is constant.

When a formal concept represents a thing, it cannot be expressed as a proposition.

I think this is quite understandable.

A can in a picture (represented by a byte array) of a cat, has a formal concept of being a cat. We can represent this as a proposition (concept proper), but this proposition cannot be guaranteed to be 100% accurate (because, naturally, recognising images is hard!).
4.127 The propositional variable signifies the formal concept, and its values signify the objects that fall under the concept.

I am not very sure what this means.
1. 4.1271 Every variable is the sign for a formal concept. For every variable represents a constant form that all its values possess, and this can be regarded as a formal property of those values.
  
  Kinda… if we formally substitute and expand everything… And start seeing functions as operating on whole sets?
  
  Isn't it that in machines "formal concepts" are only 1 and 2?
2. 4.1272 Thus the variable name ‘x’ is the proper sign for the pseudo-concept object. Wherever the word ‘object’ (‘thing’, etc.) is correctly used, it is expressed in conceptual notation by a variable name. For example, in the proposition, ‘There are 2 objects which…’, it is expressed by ‘(∃x, y)…’. Wherever it is used in a different way, that is as a proper concept-word, nonsensical pseudo-propositions are the result. So one cannot say, for example, ‘There are objects’, as one might say, ‘There are books’. And it is just as impossible to say, ‘There are 100 objects’, or, ‘There are χ_0 objects’. And it is nonsensical to speak of the total number of objects. The same applies to the words ‘complex’, ‘fact’, ‘function’, ‘number’, etc. They all signify formal concepts, and are represented in conceptual notation by variables, not by functions or classes (as Frege and Russell believed). ‘1 is a number’, ‘There is only one zero’, and all similar expressions are nonsensical. (It is just as nonsensical to say, ‘There is only one 1’, as it would be to say, ‘2+2 at 3 o’clock equals 4’.)
  
  No, I do not understand that.
  
  If we use Church-encoding, we can avoid using primitive numbers.
  
  And these will be "different" ones in the different propositions.
  
  However, maybe he is speaking about "interning", functions, numbers, strings?
  
  Or, when speaking about "machine learning", about labels of the objects?
  
  It would make sense to intern labels?
  1. 4.12721 A formal concept is given immediately any object falling under it is given. It is not possible, therefore, to introduce as primitive ideas objects belonging to a formal concept and the formal concept itself. So it is impossible, for example, to introduce as primitive ideas both the concept of a function and specific functions, as Russell does; or the concept of a number and particular numbers.
    
    In Lisp we seem to have an opposite view. There are "interned symbols" and "uninterned symbols".
    
    The key here, I guess, is "primitive ideas".
    
    I think, Wittgenstein stresses the word "primitive". It is possible to introduce both "numbers", and "number 1", but one of those has to be non-primitive.
3. 4.1273 If we want to express in conceptual notation the general proposition, ‘b is a successor of a’, then we require an expression for the general term of the series of forms {aRb, (∃x):aRx.xRb, (∃x, y):aRx.xRy.yRb,…}. In order to express the general term of a series of forms, we must use a variable, because the concept ‘term of that series of forms’ is a formal concept. (This is what Frege and Russell overlooked: consequently the way in which they want to express general propositions like the one above is incorrect; it contains a vicious circle.) We can determine the general term of a series of forms by giving its first term and the general form of the operation that produces the next term out of the proposition that precedes it.
  
  This is kind of like trying to invent a lambda-expression, while not yet knowing what a lambda is. Trying to encode a function in a formal notation.
  
  Poor Wittgenstein, born too early.
4. 4.1274 To ask whether a formal concept exists is nonsensical. For no proposition can be the answer to such a question. (So, for example, the question, ‘Are there unanalysable subject-predicate propositions?’ cannot be asked.)
  
  I think this means that the usage of something in a piece of code implies semantical existence of this something.
  
  When we use a label in a machine-learning program, say, to distinguish chairs from tables, we almost imply that the very concept of chairs exists, since we use it in our code.
4.128 Logical forms are without number. Hence there are no pre-eminent numbers in logic, and hence there is no possibility of philosophical monism or dualism, etc.

Why are they without number? Alphabets generally generate a countable number of words.

What does "no possibility of philosophical monism or dualism"? I remember that monism and dualism refer to the unity or separateness of brain and body.

4.2. 4.2 The sense of a proposition is its agreement and disagreement with possibilities of existence and non-existence of states of affairs.

That's a bit Popperian "verifiability" and "falsifiability".

A procedure has sense if it makes some inference about the input data, what is and what is not the case.

4.2.1. 4.21 The simplest kind of proposition, an elementary proposition, asserts the existence of a state of affairs.

Seems to be, essentially, leaving input untouched and always returning "true".

4.211 It is a sign of a proposition’s being elementary that there can be no elementary proposition contradicting it.

Unless your computation system is self-contradictory?

4.2.2. 4.22 An elementary proposition consists of names. It is a nexus, a concatenation, of names.

Not obvious.

Maybe I am overthinking it? An elementary proposition is just a state of the bit N in the input? Or, maybe, subset of bits.

4.221 It is obvious that the analysis of propositions must bring us to elementary propositions which consist of names in immediate combination. This raises the question how such combination into propositions comes about.

Not obvious. It is true that propositions may be symbolic expressions, and if they obey certain syntax, they will, essentially consist of juxtaposed variables… and function calls. These function calls are important.
1. 4.2211 Even if the world is infinitely complex, so that every fact consists of infinitely many states of affairs and every state of affairs is composed of infinitely many objects, there would still have to be objects and states of affairs.
  
  This is supposed to be a commentary to the "law of forming propositions". Seems not very useful.

4.2.3. 4.23 It is only in the nexus of an elementary proposition that a name occurs in a proposition.

Mumbo-jumbo.

Does it mean that an "elementary proposition" has no variables, essentially?

4.2.4. 4.24 Names are the simple symbols: I indicate them by single letters (‘x’, ‘y’, ‘z’). I write elementary propositions as functions of names, so that they have the form ‘fx’, ‘ φ (x,y)’, etc. Or I indicate them by the letters ‘p’, ‘q’, ‘r’.

Okay, fairly standard notation.

4.241 When I use two signs with one and the same meaning, I express this by putting the sign ‘=’ between them. So ‘a = b’ means that the sign ‘b’ can be substituted for the sign ‘a’. (If I use an equation to introduce a new sign ‘b’, laying down that it shall serve as a substitute for a sign ‘a’ that is already known, then, like Russell, I write the equation—definition—in the form ‘a = b Def.’ A definition is a rule dealing with signs.)

There is some caveat here between evaluation and substitution, defun and defmacro.

In his case, "=" is more like "eqv?", than "eq?".
4.242 Expressions of the form ‘a = b’ are, therefore, mere representational devices. They state nothing about the meaning of the signs ‘a’ and ‘b’.

An ‘a = b’ can also be false?
4.243 Can we understand two names without knowing whether they signify the same thing or two different things? — Can we understand a proposition in which two names occur without knowing whether their meaning is the same or different? Suppose I know the meaning of an English word and of a German word that means the same: then it is impossible for me to be unaware that they do mean the same; I must be capable of translating each into the other. Expressions like ‘a = a’, and those derived from them, are neither elementary propositions nor is there any other way in which they have sense. (This will become evident later.)

So, he is more or less trying to define logical equality in this subchapter?

In Scheme we have "eq?", "eqv?", "equal?" to represent "same" as in "same place in memory", "almost the same", and "equivalent in meaning".

4.2.5. 4.25 If an elementary proposition is true, the state of affairs exists: if an elementary proposition is false, the state of affairs does not exist.

Why not the other way round?

Perhaps, if we only speak about bits, then "true" would be equivalent to 1, and "false" to 0.

4.2.6. 4.26 If all true elementary propositions are given, the result is a complete description of the world. The world is completely described by giving all elementary propositions, and adding which of them are true and which false.

Ah! I didn't understand that.

For Wittgenstein, not all of the underlying structure of the world accessible through input. Some memory cells are inaccessible as individual cells. (But, I guess, are accessible through propositions acting on blocks.)

4.2.7. 4.27 For n states of affairs, there are \(K_n=\sum_{v=0}^n \binom{n}{v}\) possibilities of existence and non-existence. Of these states of affairs any combination can exist and the remainder not exist.

At this point something start worrying me about Wittgenstein's mathematical skills.

The sum is actually \(2^n\). And this is in perfect accord with \(2^n\) possible configurations of bits. It seems that his "states of affairs" are really bits.

4.2.8. 4.28 There correspond to these combinations the same number of possibilities of truth—and falsity—for n elementary propositions.

Yeah, again, basically, input.

4.3. 4.3 Truth-possibilities of elementary propositions mean possibilities of existence and non-existence of states of affairs.

Again, basically, bits.

4.3.1. 4.31 We can represent truth-possibilities by schemata of the following kind (‘T’ means ‘true’, ‘F’ means ‘false’; the rows of ‘T’s’ and ‘F’s’ under the row of elementary propositions symbolize their truth-possibilities in a way that can easily be understood):

p	q	r
T	T	T
F	T	T
T	F	T
T	T	F
F	F	T
F	T	F
T	F	F
F	F	F

p	q
T	T
F	T
T	F
F	F

p
T
F

Oh, right, the first three binoms.

4.4. 4.4 A proposition is an expression of agreement and disagreement with truth-possibilities of elementary propositions.

Basically, saying that you can turn any boolean function into a conjunction of disjunctions.

4.4.1. 4.41 Truth-possibilities of elementary propositions are the conditions of the truth and falsity of propositions.

Again, since we have no input and no randomness, input is the only thing left.

4.411 It immediately strikes one as probable that the introduction of elementary propositions provides the basis for understanding all other kinds of proposition. Indeed the understanding of general propositions palpably depends on the understanding of elementary propositions.

Not very clear.

But clearly, all computation should eventually reduce to operations on bits.

4.4.2. TODO 4.42 For n elementary propositions there are \(\sum_{k=0}^{K_n}\left(\frac{K_n}{k}\right)=L_n\) ways in which a proposition can agree and disagree with their truth-possibilities

We already know that \(K_n=2^n\), that's all possible bit-arrays.

Is that even true combinatorially?

How did he get this number, proponent of clarity?

Is it the maximal number of clauses in an conjunctive form? A proposition can be conditioned on 1 to n variables, and the i'th subset of those 1 to n variables can be in \(2^i\) configurations, each of which may or may not deliver truthfulness to the proposition.

4.4.3. 4.43 We can express agreement with truth-possibilities by correlating the mark ‘T’ (true) with them in the schema. The absence of this mark means disagreement.

Ok?

4.431 The expression of agreement and disagreement with the truth-possibilities of elementary propositions expresses the truth-conditions of a proposition. A proposition is the expression of its truth conditions. (Thus Frege was quite right to use them as a starting point when he explained the signs of his conceptual notation. But the explanation of the concept of truth that Frege gives is mistaken: if ‘the true’ and ‘the false’ were really objects, and were the arguments in ~p etc., then Frege’s method of determining the sense of ‘~p’ would leave it absolutely undetermined.)

This is almost again, reference to the "disjunctions of conjunctions" form (that "expression").

4.4.4. 4.44 The sign that results from correlating the mark ‘T’ with truth-possibilities is a propositional sign.

"Formula?"

In our case it would be "a program".

4.441 It is clear that a complex of the signs ‘F’ and ‘T’ has no object (or complex of objects) corresponding to it, just as there is none corresponding to the horizontal and vertical lines or to the brackets. — There are no ‘logical objects’. Of course the same applies to all signs that express what the schemata of ‘T’s’ and ‘F’s’ express.

Does he mean his own peculiar definition of objects, or objects in general?

Or, is he saying, again, that only human interpretation gives meaning to Ts and Fs?
4.442 For example, the following is a propositional sign:@ (Frege’s ‘judgement-stroke’ ‘|–’ is logically quite meaningless: in the works of Frege (and Russell) it simply indicates that these authors hold the propositions marked with this sign to be true. Thus ‘|–’ is no more a component part of a proposition than is, for instance, the proposition’s number. It is quite impossible for a proposition to state that it itself is true.) If the order of the truth-possibilities in a schema is fixed once and for all by a combinatory rule, then the last column by itself will be an expression of the truth-conditions. If we now write this column as a row, the propositional sign will become ‘(TT-T) (p,q)’ or more explicitly ‘(TTFT) (p,q)’. (The number of places in the left-hand pair of brackets is determined by the number of terms in the right-hand pair.)

@=

p q

T T T

F T T

T F

F F T

This seems to be just his peculiar notation. Yes, if you fix the order of variables, you can define a predicate as a bit sub-set on the bit set representing all possible states of your input and do a table lookup, instead of computing the value.

4.4.5. 4.45 For n elementary propositions there are L n possible groups of truth-conditions. The groups of truth-conditions that are obtainable from the truth-possibilities of a given number of elementary propositions can be arranged in a series.

Yeah, elements in the disjunctive form.

4.4.6. 4.46 Among the possible groups of truth-conditions there are two extreme cases. In one of these cases the proposition is true for all the truth-possibilities of the elementary propositions. We say that the truth-conditions are `tautological`. In the second case the proposition is false for all the truth-possibilities: the truth-conditions are `contradictory`. In the first case we call the proposition a tautology; in the second, a contradiction.

Again, first-year mathematical logic course.

Tautologies are also called "laws of logic".

4.461 Propositions show what they say: tautologies and contradictions show that they say nothing. A tautology has no truth-conditions, since it is unconditionally true: and a contradiction is true on no condition. Tautologies and contradictions lack sense. (Like a point from which two arrows go out in opposite directions to one another.) (For example, I know nothing about the weather when I know that it is either raining or not raining.)

Indeed, but tautologies may also serve as a way to simplify expressions (reduce formulas).

Contradictions may serve as compile-time checkers of correctness.
1. 4.4611 Tautologies and contradictions are not, however, nonsensical. They are part of the symbolism, much as ‘0’ is part of the symbolism of arithmetic.
  
  So, supposedly, we are interested in creating "theorems", that is, tautologies with a colossal number of input bits, that are hard to infer by brute-force.
4.462 Tautologies and contradictions are not pictures of reality. They do not represent any possible situations. For the former admit all possible situations, and the latter none. In a tautology the conditions of agreement with the world — the representational relations — cancel one another, so that it does not stand in any representational relation to reality.

I guess, he implies that logic is above reality here.
4.463 The truth-conditions of a proposition determine the range that it leaves open to the facts. (A proposition, a picture, or a model is, in the negative sense, like a solid body that restricts the freedom of movement of others, and, in the positive sense, like a space bounded by solid substance in which there is room for a body.) A tautology leaves open to reality the whole—the infinite whole—of logical space: a contradiction fills the whole of logical space leaving no point of it for reality. Thus neither of them can determine reality in any way.

I really like this metaphor.

I can also relate this to an operator defining the behaviour of a system. You take an initial condition (input), and evolve it.
4.464 A tautology’s truth is certain, a proposition’s possible, a contradiction’s impossible. (Certain, possible, impossible: here we have the first indication of the scale that we need in the theory of probability.)

That is a very nice link.
4.465 The logical product of a tautology and a proposition says the same thing as the proposition. This product, therefore, is identical with the proposition. For it is impossible to alter what is essential to a symbol without altering its sense.

What is a "logical product"? I guess, rewriting a function using an information-preserving rule (tautology)? However, without altering its sense, may still mean "making it a lot faster".
4.466 What corresponds to a determinate logical combination of signs is a determinate logical combination of their meanings. It is only to the uncombined signs that absolutely any combination corresponds. In other words, propositions that are true for every situation cannot be combinations of signs at all, since, if they were, only determinate combinations of objects could correspond to them. (And what is not a logical combination has no combination of objects corresponding to it.) Tautology and contradiction are the limiting cases — indeed the disintegration — of the combination of signs.

I am not very sure what he means by "combination" here. May be a combination in the Scheme sense, a computational graph, or just a multidimensional (in the sense of run in parallel) single-bit propositions.
1. 4.4661 Admittedly the signs are still combined with one another even in tautologies and contradictions—i.e. they stand in certain relations to one another: but these relations have no meaning, they are not essential to the symbol.
  
  So, you can write a procedure with a lot of operations in it, but if it always produces #t, or #f, all of that operations are in vain.
  
  Optimise them out!

4.5. 4.5 It now seems possible to give the most general propositional form: that is, to give a description of the propositions of `any` sign-language `whatsoever` in such a way that every possible sense can be expressed by a symbol satisfying the description, and every symbol satisfying the description can express a sense, provided that the meanings of the names are suitably chosen. It is clear that `only` what is essential to the most general propositional form may be included in its description — for otherwise it would not be the most general form. The existence of a general propositional form is proved by the fact that there cannot be a proposition whose form could not have been foreseen (i.e. constructed). The general form of a proposition is: This is how things stand.

"This is how things stand."

Seems true, but useless?

In other words, there is a function that produces all of the information about the world described in the input – it just outputs all the input verbatim.

4.5.1. 4.51 Suppose that I am given `all` elementary propositions: then I can simply ask what propositions I can construct out of them. And there I have `all` propositions, and that fixes their limits.

This statement is a little shaky with infinite inputs.

But if input is finite, then all possible conjunctive forms on the input length − all possible propositions will give the list of all propositions.

Again, their number is colossal.

4.5.2. 4.52 Propositions comprise all that follows from the totality of all elementary propositions (and, of course, from its being the `totality` of them `all`). (Thus, in a certain sense, it could be said that all propositions were generalizations of elementary propositions.)

Seems clear. All meaningful functions on N bits, we can pre-compute them all, and relax.

4.5.3. 4.53 The general propositional form is a variable.

And then we can use that "seemingly, existing" function output in new computation?

Is that what he means?

5. 5 A proposition is a truth-function of elementary propositions. (An elementary proposition is a truth-function of itself.)

Because everything is eventually boolean and binary. Sort of.

5.0.1. 5.01 Elementary propositions are the truth-arguments of propositions.

Because functions have to work on something.

5.0.2. 5.02 The arguments of functions are readily confused with the affixes of names. For both arguments and affixes enable me to recognize the meaning of the signs containing them. For example, when Russell writes ‘\(+_{_c}\)’, the ‘ \(\mbox{ }_c\) ’ is an affix which indicates that the sign as a whole is the addition-sign for cardinal numbers. But the use of this sign is the result of arbitrary convention and it would be quite possible to choose a simple sign instead of ‘\(+_c\)’; in ‘~p’, however, ‘p’ is not an affix but an argument: the sense of ‘~p’ `cannot` be understood unless the sense of ‘p’ has been understood already. (In the name Julius Caesar ‘Julius’ is an affix. An affix is always part of a description of the object to whose name we attach it: e.g. `the` Caesar of the Julian gens.) If I am not mistaken, Frege’s theory about the meaning of propositions and functions is based on the confusion between an argument and an affix. Frege regarded the propositions of logic as names, and their arguments as the affixes of those names.

This is a nice discussion of the difference between names and applications.

Something that seems totally obvious to programmers.

With generalised functions, though, you could have (+ 'cardinal a b)

5.1. 5.1 Truth-functions can be arranged in series. That is the foundation of the theory of probability.

Unclear.

Theory of probability is, in fact, totally deterministic, if we take distributions as the basic property of existence.

5.101 The truth-functions of a given number of elementary propositions can always be set out in a schema of the following kind:@ I will give the name truth-grounds of a proposition to those truth-possibilities of its truth-arguments that make it true.

(T T T T) (p, q) Tautology (If p then p, and if q then q.) (p ⊃ p . q ⊃ q) (F T T T) (p, q) In words: Not both p and q. (~(p . q)) (T F T T) (p, q) ,, ,, : If q then p. (q ⊃ p) (T T F T) (p, q) ,, ,, : If p then q. (p ⊃ q) (T T T F) (p, q) ,, ,, : p or q. (p v q) (F F T T) (p, q) ,, ,, : Not q. (~q) (F T F T) (p, q) ,, ,, : Not p. (~p) (F T T F) (p, q) ,, ,, : p or q, but not both. (p.~q : v : q.~p) (T F F T) (p, q) ,, ,, : If p then q, and if q then p. (p ≡ q) (T F T F) (p, q) ,, ,, : p (T T F F) (p, q) ,, ,, : q (F F F T) (p, q) ,, ,, : Neither p nor q. (~p .~q or p|q) (F F T F) (p, q) ,, ,, : p and not q. (p . ~q) (F T F F) (p, q) ,, ,, : q and not p. (q . ∼p) (T F F F) (p, q) ,, ,, : q and p. (q . p) (F F F F) (p, q) Contradiction (p and not p, and q and not q.) (p . ~p . q . ~q)

I guess, this is a first step in rewriting conjunctive forms into something algorithmic.

truth-grounds seem to mean "subset of input when the function is true".

5.1.1. 5.11 If all the truth-grounds that are common to a number of propositions are at the same time truth-grounds of a certain proposition, then we say that the truth of that proposition follows from the truth of the others.

We should be able to avoid computing the second function then.

5.1.2. 5.12 In particular, the truth of a proposition ‘p’ follows from the truth of another proposition ‘q’ if all the truth-grounds of the latter are truth-grounds of the former.

Same.

5.121 The truth-grounds of the one are contained in those of the other: p follows from q.

Seems like another optimisation opportunity.
5.122 If p follows from q, the sense of ‘p’ is contained in the sense of ‘q’.

What about the cases when p→q, but not always the opposite?
5.123 If a god creates a world in which certain propositions are true, then by that very act he also creates a world in which all the propositions that follow from them come true. And similarly he could not create a world in which the proposition ‘p’ was true without creating all its objects.

I guess, this should urge us to believe that we can find those “propositions that follow”.
5.124 A proposition affirms every proposition that follows from it.

Okay… Not sure I understand why this is important.
1. 5.1241 ‘p. q’ is one of the propositions that affirm ‘p’ and at the same time one of the propositions that affirm ‘q’. Two propositions are opposed to one another if there is no proposition with a sense, that affirms them both. Every proposition that contradicts another negates it.
  
  This is how two propositions may behave on the same input.

5.1.3. 5.13 When the truth of one proposition follows from the truth of others, we can see this from the structure of the propositions.

Not sure I agree. Suppose q is true if p is true, but never uses p directly as a call. Then the structures are unlikely to show us the connection.

Perhaps, if "can" is seen as "it is possible, although may be computationally expensive", then I am fine.

5.131 If the truth of one proposition follows from the truth of others, this finds expression in relations in which the forms of the propositions stand to one another: nor is it necessary for us to set up these relations between them, by combining them with one another in a single proposition; on the contrary, the relations are internal, and their existence is an immediate result of the existence of the propositions.

Since that "follows" relation only depends on the input.
1. 5.1311 When we infer q from p ∨ q and ~p, the relation between the propositional forms of ‘p v q’ and ‘~p’ is masked, in this case, by our mode of signifying. But if instead of ‘p v q’ we write, for example, ‘p|q.|.p|q’, and instead of ‘~p’, ‘p|p’ (p|q = neither p nor q), then the inner connexion becomes obvious. (The possibility of inference from (x).fx to fa shows that the symbol (x).fx itself has generality in it.)
  
  His notation is ugly. But generally he is saying the same thing – logical connections are defined by operations on input, not structures of procedures.
5.132 If p follows from q, I can make an inference from q to p, deduce p from q. The nature of the inference can be gathered only from the two propositions. They themselves are the only possible justification of the inference. ‘Laws of inference’, which are supposed to justify inferences, as in the works of Frege and Russell, have no sense, and would be superfluous.

This almost seems as if he is suggesting that interpreters should be written in themselves.
5.133 All deductions are made a priori.

Meaning, not conditioned on input? Because for full generality, deductions should be input-independent? (Be tautologies?)
5.134 One elementary proposition cannot be deduced from another.

Of course, because they are input.
5.135 There is no possible way of making an inference from the existence of one situation to the existence of another, entirely different situation.

Different inputs are just different inputs.
5.136 There is no causal nexus to justify such an inference.

Even if the inputs we want to process are not evenly distributed, we can reduce them to inputs that are evenly distributed.
1. 5.1361 We cannot infer the events of the future from those of the present. Superstition is nothing but belief in the causal nexus.
  
  I think, this should be seen as: "Do not confuse system evolution and different inputs."
2. 5.1362 The freedom of the will consists in the impossibility of knowing actions that still lie in the future. We could know them only if causality were an inner necessity like that of logical inference. — The connexion between knowledge and what is known is that of logical necessity. (‘A knows that p is the case’, has no sense if p is a tautology.)
  
  An interesting thought!
  
  I guess, we can see "freedom of will" as "true I/O". Choosing the next inputs.
3. 5.1363 If the truth of a proposition does not follow from the fact that it is self-evident to us, then its self-evidence in no way justifies our belief in its truth.
  
  Hehe.
  
  A political statement.

5.1.4. 5.14 If one proposition follows from another, then the latter says more than the former, and the former less than the latter.

But, again, their computational complexity may be different.

5.141 If p follows from q and q from p, then they are one and the same proposition.

Logically, yes. Computationally, though, speed, again, is the key. Furthermore, we may have different places in memory corresponding to two identically behaving procedures.
5.142 A tautology follows from all propositions: it says nothing.

He is repeating himself. We can use tautologies for optimisation and rewriting.
5.143 Contradiction is that common factor of propositions which no proposition has in common with another. Tautology is the common factor of all propositions that have nothing in common with one another. Contradiction, one might say, vanishes outside all propositions: tautology vanishes inside them. Contradiction is the outer limit of propositions: tautology is the unsubstantial point at their centre.

That is a metaphorical statement? What is a "common factor"?

5.1.5. 5.15 If \(T_r\) is the number of the truth-grounds of a proposition ‘r’, and if \(T_{rs}\) is the number of the truth-grounds of a proposition ‘s’ that are at the same time truth-grounds of ‘r’, then we call the ratio \(T_{rs} : T_{r}\) the degree of probability that the proposition ‘r’ gives to the proposition ‘s’.

That's a bit markovian in spirit. That's, again, if we assume that inputs are uniformly distributed.

5.151 In a schema like the one above in 5.101, let \(T_{r}\) be the number of ‘T’s’ in the proposition r, and let \(T_{rs}\) be the number of ‘T’s’ in the proposition s that stand in columns in which the proposition r has ‘T’s’. Then the proposition r gives to the proposition s the probability \(T_{rs} : T_r \) .

An example, ok.
1. 5.1511 There is no special object peculiar to probability propositions.
  
  We can compute probabilities by simulating successes and failures?
5.152 When propositions have no truth-arguments in common with one another, we call them independent of one another. Two elementary propositions give one another the probability \(\frac{1}{2}\). If p follows from q, then the proposition ‘q’ gives to the proposition ‘p’ the probability 1. The certainty of logical inference is a limiting case of probability. (Application of this to tautology and contradiction.)

Fine, as long as there is some intrinsic uniformity to the input. (Maybe the non-uniformity of input is called "luck".)
5.153 In itself, a proposition is neither probable nor improbable. Either an event occurs or it does not: there is no middle way.

Again, this depends essentially on the input.
5.154 Suppose that an urn contains black and white balls in equal numbers (and none of any other kind). I draw one ball after another, putting them back into the urn. By this experiment I can establish that the number of black balls drawn and the number of white balls drawn approximate to one another as the draw continues. So this is not a mathematical truth. Now, if I say, ‘The probability of my drawing a white ball is equal to the probability of my drawing a black one’, this means that all the circumstances that I know of (including the laws of nature assumed as hypotheses) give no more probability to the occurrence of the one event than to that of the other. That is to say, they give each the probability \(\frac{1}{2}\), as can easily be gathered from the above definitions. What I confirm by the experiment is that the occurrence of the two events is independent of the circumstances of which I have no more detailed knowledge.

Here Wittgenstein is trying to justify statistics rather than probability. Still, not very convincing, but let it be.
5.155 The minimal unit for a probability proposition is this: The circumstances — of which I have no further knowledge — give such and such a degree of probability to the occurrence of a particular event.

Not very rigorously defined.
5.156 It is in this way that probability is a generalization. It involves a general description of a propositional form. We use probability only in default of certainty — if our knowledge of a fact is not indeed complete, but we do know something about its form. (A proposition may well be an incomplete picture of a certain situation, but it is always a complete picture of something.) A probability proposition is a sort of excerpt from other propositions.

So, a probabilistic proposition is a proposition about other propositions. Isn't this meta-logic, again? Your propositions must be in the memory.

This "complete picture of something" is important.

5.2. 5.2 The structures of propositions stand in internal relations to one another.

Let's think about this for a moment. What is a "structure of a proposition"?

"Internal relations" are the ones determined by the structures themselves, rather than other propositions.

5.2.1. 5.21 In order to give prominence to these internal relations we can adopt the following mode of expression: we can represent a proposition as the result of an operation that produces it out of other propositions (which are the bases of the operation).

Aha, that's what in Scheme we call "combinations".

However, Wittgenstein needs to define what an "operation" is.

5.2.2. 5.22 An operation is the expression of a relation between the structures of its result and of its bases.

Again, that's easy, but not. (and elementary-proposition-1 elementary-proposition-2) is an expression, with two elementary propositions, and rule, wrapped into a combination.

5.2.3. 5.23 The operation is what has to be done to the one proposition in order to make the other out of it.

Again, this is obvious, but informal.

5.231 And that will, of course, depend on their formal properties, on the internal similarity of their forms.

The result, I guess? Because currently, in this form, we see no obvious dependency on the internal similarity.
5.232 The internal relation by which a series is ordered is equivalent to the operation that produces one term from another.

Are these sequences always expressible logically unambiguously? What about those aperiodic tilings… uncomputable ones?

I think this claim is wrong.
5.233 Operations cannot make their appearance before the point at which one proposition is generated out of another in a logically meaningful way; i.e. the point at which the logical construction of propositions begins.

Does it mean that operations are only defined "by example"?

I guess, "logically meaningful" is the important here.

Perhaps, that is why syntactic structures in Scheme are only accessible compile-time?
5.234 Truth-functions of elementary propositions are results of operations with elementary propositions as bases. (These operations I call truth-operations.)

Boolean functions, basically?
1. 5.2341 The sense of a truth-function of p is a function of the sense of p. Negation, logical addition, logical multiplication, etc. etc. are operations. (Negation reverses the sense of a proposition.)
  
  And those have to be "basic" operations, I guess.

5.2.4. 5.24 An operation manifests itself in a variable; it shows how we can get from one form of proposition to another. It gives expression to the difference between the forms. (And what the bases of an operation and its result have in common is just the bases themselves.)

I think that "manifests" here means, either "can be assigned to a variable", or "it's value given input can be assigned to a variable".

5.241 An operation is not the mark of a form, but only of a difference between forms.

Seems like he's saying that the "operation" is not inside the input, but rather an external thing, from the domain of "logic" (or "computation")?
5.242 The operation that produces ‘q’ from ‘p’ also produces ‘r’ from ‘q’, and so on. There is only one way of expressing this: ‘p’, ‘q’, ‘r’, etc. have to be variables that give expression in a general way to certain formal relations.

Ow… maybe he's actually struggling with recursion here?

His obsession with "variables" in this chapter comes from an inability to distinguish "eval" from "substitute"?

5.2.5. 5.25 The occurrence of an operation does not characterize the sense of a proposition. Indeed, no statement is made by an operation, but only by its result, and this depends on the bases of the operation. (Operations and functions must not be confused with each other.)

Again, he seems to be struggling to properly distinguish "functions", "macros", and "primitives".

5.251 A function cannot be its own argument, whereas an operation can take one of its own results as its base.

Ha! A function cannot be its own argument. I think that the logical system constructed by Wittgenstein is actually weaker than Lisp.

And he still hasn't properly defined "functions".
5.252 It is only in this way that the step from one term of a series of forms to another is possible (from one type to another in the hierarchies of Russell and Whitehead). (Russell and Whitehead did not admit the possibility of such steps, but repeatedly availed themselves of it.)

So an "operation" is like a macro mixed with application. For him, functions cannot call themselves, but operations can iteratively apply functions to their own outputs.
1. 5.2521 If an operation is applied repeatedly to its own results, I speak of successive applications of it. (‘O’O’O’a’ is the result of three successive applications of the operation ‘O’ ξ ’ to ‘a’.) In a similar sense I speak of successive applications of more than one operation to a number of propositions.
  
  What is ξ here?
2. 5.2522 Accordingly I use the sign ‘[a, x, O’x]’ for the general term of the series of forms a, O’a, O’O’a, . . . .This bracketed expression is a variable: the first term of the bracketed expression is the beginning of the series of forms, the second is the form of a term x arbitrarily selected from the series, and the third is the form of the term that immediately follows x in the series.
  
  So, he desperately needs looping constructions in his language, and is trying to invent notation for them.
3. 5.2523 The concept of successive applications of an operation is equivalent to the concept ‘and so on’.
  
  Yeah, I guess, looping has been boggling scientists for a long time.
5.253 One operation can counteract the effect of another. Operations can cancel one another.

Well, if information is not lost. Although, I guess, if your machine time is cheap, you can recompute everything from scratch. This is basically, in the worst case, backtracking.
5.254 An operation can vanish (e.g. negation in ‘~~p’: ~~p = p).

He has to make not an operation, because his functions are somehow dysfunctional, I guess.

5.3. 5.3 All propositions are results of truth-operations on elementary propositions. A truth-operation is the way in which a truth-function is produced out of elementary propositions. It is of the essence of truth-operations that, just as elementary propositions yield a truth-function of themselves, so too in the same way truth-functions yield a further truth-function. When a truth-operation is applied to truth-functions of elementary propositions, it always generates another truth-function of elementary propositions, another proposition. When a truth-operation is applied to the results of truth-operations on elementary propositions, there is always a single operation on elementary propositions that has the same result. Every proposition is the result of truth-operations on elementary propositions.

Again, "truth-operations" are intertwined with "truth-functions".

Okay, so for him, functions are "substitutable", not "evaluatable". But operations are always eager. For primitives, it is irrelevant whether they are lazy or eager, because they are essentially bits.

5.3.1. 5.31 The schemata in 4.31 have a meaning even when ‘p’, ‘q’, ‘r’, etc. are not elementary propositions. And it is easy to see that the propositional sign in 4.442 expresses a single truth-function of elementary propositions even when ‘p’ and ‘q’ are truth-functions of elementary propositions.

5.3.2. 5.32 All truth-functions are results of successive applications to elementary propositions of a finite number of truth-operations.

I think, there is some trouble with undecidability here.

5.4. 5.4 At this point it becomes manifest that there are no ‘logical objects’ or ‘logical constants’ (in Frege’s and Russell’s sense).

What is this Frege and Russel sense? Perhaps, he wants to say that there is no need in such a thing as "logical object", if it can be expressed in functions and operations.

5.4.1. 5.41 The reason is that the results of truth-operations on truth-functions are always identical whenever they are one and the same truth-function of elementary propositions.

Didn't he himself objected to the idea of "same" functions?

5.4.2. 5.42 It is self-evident that v, ⊃, etc. are not relations in the sense in which right and left etc. are relations. The interdefinability of Frege’s and Russell’s ‘primitive signs’ of logic is enough to show that they are not primitive signs, still less signs for relations. And it is obvious that the ‘⊃’ defined by means of ‘~’ and ‘v’ is identical with the one that figures with ‘∼’ in the definition of ‘v’; and that the second ‘v’ is identical with the first one; and so on.

Some polemic with other language builders. How familiar.

"We can express this in that, therefore this is not fundamental. Get lost, you, Common Lisp people."

5.4.3. 5.43 Even at first sight it seems scarcely credible that there should follow from one fact p infinitely many others, namely p, ~~p, etc. And it is no less remarkable that the infinite number of propositions of logic (mathematics) follow from half a dozen ‘primitive propositions’. But in fact all the propositions of logic say the same thing, to wit nothing.

For them at the time it is still not obvious that everything is basically one huge array of XORs.

The problem is usually not infinity, but rather fighting infinity.

5.4.4. 5.44 Truth-functions are not material functions. For example, an affirmation can be produced by double negation: in such a case does it follow that in some sense negation is contained in affirmation? Does ‘~~p’ negate ~p, or does it affirm p — or both? The proposition ‘~~p’ is not about negation, as if negation were an object: on the other hand, the possibility of negation is already written into affirmation. And if there were an object called ‘~’, it would follow that ‘~~p’ said something different from what ‘p’ said, just because the one proposition would then be about ~ and the other would not.

Again, evaluation is boggling him.

5.441 This vanishing of the apparent logical constants also occurs in the case of ‘~(∃x).~fx’, which says the same as ‘(x).fx’, and in the case of ‘(∃x).fx.x = a’, which says the same as ‘fa’.

Not sure I understand this example. He suggests reducing everything? Not always works.
5.442 If we are given a proposition, then with it we are also given the results of all truth-operations that have it as their base.

I guess. Shall we use them for precomputation? all is usually unmanageable.

No, I do not understand. Those "all truth-operations" may need to evaluate other propositions.

5.4.5. 5.45 If there are primitive logical signs, then any logic that fails to show clearly how they are placed relatively to one another and to justify their existence will be incorrect. The construction of logic out of its primitive signs must be made clear.

I guess, this means that evaluation should have proper semantic, and primitives of the language must be described somewhere, in a language standard.

5.451 If logic has primitive ideas, they must be independent of one another. If a primitive idea has been introduced, it must have been introduced in all the combinations in which it ever occurs. It cannot, therefore, be introduced first for one combination and later re-introduced for another. For example, once negation has been introduced, we must understand it both in propositions of the form ‘~p’ and in propositions like ‘~(p v q)’, ‘(∃x).~fx’, etc. We must not introduce it first for the one class of cases and then for the other, since it would then be left in doubt whether its meaning were the same in both cases, and no reason would have been given for combining the signs in the same way in both cases. (In short, Frege’s remarks about introducing signs by means of definitions (in The Fundamental Laws of Arithmetic) also apply, mutatis mutandis, to the introduction of primitive signs.)

Well, even in Scheme we have different versions of begin. Other languages are even worse in terms of defining primitives.
5.452 The introduction of any new device into the symbolism of logic is necessarily a momentous event. In logic a new device should not be introduced in brackets or in a footnote with what one might call a completely innocent air. (Thus in Russell and Whitehead’s Principia Mathematica there occur definitions and primitive propositions expressed in words. Why this sudden appearance of words? It would require a justification, but none is given, or could be given, since the procedure is in fact illicit.) But if the introduction of a new device has proved necessary at a certain point, we must immediately ask ourselves, ‘At what points is the employment of this device now unavoidable?’ and its place in logic must be made clear.

This "device" is clearly "introduction of new language features".

"Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary."
5.453 All numbers in logic stand in need of justification. Or rather, it must become evident that there are no numbers in logic. There are no pre-eminent numbers.

I guess, in the same way as Wittgenstein dislikes truth constants (I think, he prefers having 1 and 0 instead. And a (false? ) predicate), he dislikes numbers as primitives. Russel has 0 as a basic, number, I think, and a +1 operation. Church encoding lets you live without even 0.
5.454 In logic there is no co-ordinate status, and there can be no classification. In logic there can be no distinction between the general and the specific.

Is this again a reference to reasoning over types instead of instances?
1. 5.4541 The solutions of the problems of logic must be simple, since they set the standard of simplicity. Men have always had a presentiment that there must be a realm in which the answers to questions are symmetrically combined —a priori— to form a self-contained system. A realm subject to the law: Simplex sigillum veri.
  
  I think that this is an emotional statement akin to the ones programmers call KISS. (Keep It Simple, Sir)

5.4.6. 5.46 If we introduced logical signs properly, then we should also have introduced at the same time the sense of all combinations of them; i.e. not only ‘p v q’ but ‘~(p v ~q)’ as well, etc. etc. We should also have introduced at the same time the effect of all possible combinations of brackets. And thus it would have been made clear that the real general primitive signs are not ‘p v q’, ‘(∃x).fx’, etc. but the most general form of their combinations.

Yes, but how is he going to introduce this "most general combination"? By a generative grammar?

5.461 Though it seems unimportant, it is in fact significant that the pseudo-relations of logic, such as v and ⊃, need brackets — unlike real relations. Indeed, the use of brackets with these apparently primitive signs is itself an indication that they are not the real primitive signs. And surely no one is going to believe that brackets have an independent meaning.

Hehe, in Scheme everything is used in sexps. Homoiconicity eludes him.
1. 5.4611 Signs for logical operations are punctuation-marks.
  
  Nice metaphor, but incomplete. Indeed, in human languages, punctuation marks are used for many "operator-like" purposes, but I do not think that logical operators are from the same category. Maybe brackets are.

5.4.7. 5.47 It is clear that whatever we can say in advance about the form of all propositions, we must be able to say all at once. An elementary proposition really contains all logical operations in itself. For ‘fa’ says the same thing as ‘(∃x).fx.x = a’. Wherever there is compositeness, argument and function are present, and where these are present, we already have all the logical constants. One could say that the sole logical constant was what all propositions, by their very nature, had in common with one another. But that is the general propositional form.

We need to define "the form of all propositions" before doing any reasoning. Whether this "form of all propositions" is formal syntax or formal semantic, I am not sure.

I think that he still needs #f as his logical constant. On the other hand, \(A \land ~A\) is false, so maybe not even that. All the other propositions can be combinations of elementary an primitive operations.

5.471 The general propositional form is the essence of a proposition.

Like, being combined according to the laws of the language.
1. 5.4711 To give the essence of a proposition means to give the essence of all description, and thus the essence of the world.
  
  So, the world is a program.
5.472 The description of the most general propositional form is the description of the one and only general primitive sign in logic.

That is the law of combination of propositions?
5.473 Logic must look after itself. If a sign is possible, then it is also capable of signifying. Whatever is possible in logic is also permitted. (The reason why ‘Socrates is identical’ means nothing is that there is no property called ‘identical’. The proposition is nonsensical because we have failed to make an arbitrary determination, and not because the symbol, in itself, would be illegitimate.) In a certain sense, we cannot make mistakes in logic.

This seems like a difference between a compile-time (or read-time) and run-time error. "Socrates is identical" is a run-time error, but not a compile-time error.
1. 5.4731 Self-evidence, which Russell talked about so much, can become dispensable in logic, only because language itself prevents every logical mistake. — What makes logic a priori is the impossibility of illogical thought.
  
  I think that elementary propositions still have to be self-evident.
2. 5.4732 We cannot give a sign the wrong sense.
  
  Because we cannot give a sign any sense other than it has according to the laws of logic it is written in. But we can feed a Scheme program into a Common Lisp interpreter and observe all kinds of errors.
  1. 5.47321 Occam’s maxim is, of course, not an arbitrary rule, nor one that is justified by its success in practice: its point is that unnecessary units in a sign-language mean nothing. Signs that serve one purpose are logically equivalent, and signs that serve none are logically meaningless.
    
    He is ignoring performance considerations, again!
3. 5.4733 Frege says that any legitimately constructed proposition must have a sense. And I say that any possible proposition is legitimately constructed, and, if it has no sense, that can only be because we have failed to give a meaning to some of its constituents. (Even if we think that we have done so.) Thus the reason why ‘Socrates is identical’ says nothing is that we have not given any adjectival meaning to the word ‘identical’. For when it appears as a sign for identity, it symbolizes in an entirely different way — the signifying relation is a different one — therefore the symbols also are entirely different in the two cases: the two symbols have only the sign in common, and that is an accident.
  
  An example of confusion, I guess? One of "identical" is a symbol that has to resolve to something.
  
  The other, I guess, has to be, in Wittgenstein's words, an "operation", that maps Socrates to Socrates, or Socrates to the value of Socrates.
5.474 The number of fundamental operations that are necessary depends solely on our notation.

I guess, the number cannot be 0. But with XOR we should be able to do anything.
5.475 All that is required is that we should construct a system of signs with a particular number of dimensions — with a particular mathematical multiplicity.

Unclear. Does by "multiplicity" he mean "being able to appear as argument in a proposition"?
5.476 It is clear that this is not a question of a number of primitive ideas that have to be signified, but rather of the expression of a rule.

A question of language design, essentially.

5.5. 5.5 Every truth-function is a result of successive applications to elementary propositions of the operation ‘(–—T)( ξ , . . . .)’. This operation negates all the propositions in the right-hand pair of brackets, and I call it the negation of those propositions.

Is he actually trying to introduce the Horn rule here?

Either any of the (not sub-proposition) is T, or the proposition is T.

5.501 When a bracketed expression has propositions as its terms — and the order of the terms inside the brackets is indifferent — then I indicate it by a sign of the form ‘(ξ)’. ‘ ξ’ is a variable whose values are terms of the bracketed expression and the bar over the variable [not implemented] indicates that it is the representative of all its values in the brackets. (E.g. if ξ has the three values P, Q, R, then (ξ) = (P, Q, R).) What the values of the variable are is something that is stipulated. The stipulation is a description of the propositions that have the variable as their representative. How the description of the terms of the bracketed expression is produced is not essential. We can distinguish three kinds of description: 1. direct enumeration, in which case we can simply substitute for the variable the constants that are its values; 2. giving a function fx whose values for all values of x are the propositions to be described; 3. giving a formal law that governs the construction of the propositions, in which case the bracketed expression has as its members all the terms of a series of forms.
So now Wittgenstein is trying to introduce list processing.
1. is ordinary lists
2. is eager streams
3. is lazy streams
Looks plausible? What about those ugly uncomputable sequences?
5.502 So instead of ‘(–—T)( ξ , . . . .)’, I write ‘N((ξ))’. N((ξ)) is the negation of all the values of the propositional variable ξ .

That's just notation?
1. 5.503 It is obvious that we can easily express how propositions may be constructed with this operation, and how they may not be constructed with it; so it must be possible to find an exact expression for this.
  
  So he's trying to find an exact expression for this "list negation operation"?

5.5.1. 5.51 If (ξ) has only one value, then N( ξ ) = ~p (not p); if it has two values, then N((ξ)) = ~p.~q (neither p nor q).

Again, he seems to be writing an explanation of the Horn rule. See https://en.wikipedia.org/wiki/Horn_clause

5.511 How can logic — all-embracing logic, which mirrors the world — use such peculiar crotchets and contrivances? Only because they are all connected with one another in an infinitely fine network, the great mirror.

Emotional clause. I think that it is not "the world" that is using crotchets and contrivances, it is our human brain, especially its analytic part (that is not unlike a turing machine).
1. 5.512 ‘~p’ is true if ‘p’ is false. Therefore, in the proposition ‘~p’, when it is true, ‘p’ is a false proposition. How then can the stroke ‘~’ make it agree with reality? But in ‘~p’ it is not ‘~’ that negates; it is rather what is common to all the signs of this notation that negate p. That is to say the common rule that governs the construction of ‘~p’, ‘~~~p’, ‘~p v ~p’, ‘~p.~p’, etc. etc. (ad inf.). And this common factor mirrors negation.
  
  Hm… I think there is a confusion between the universal rules and the concrete values.
  
  He certainly struggles with understanding something here.
  
  We cannot just substitute the value of p into the expression. We actually need to evaluate (not p) here.
5.513 We might say that what is common to all symbols that affirm both p and q is the proposition ‘p.q’; and that what is common to all symbols that affirm either p or q is the proposition ‘p v q’. And similarly we can say that two propositions are opposed to one another if they have nothing in common with one another, and that every proposition has only one negative, since there is only one proposition that lies completely outside it. Thus in Russell’s notation too it is manifest that ‘q:p v ~p’ says the same thing as ‘q’, that ‘p v ~p’ says nothing.

This is still speaking about substitution vs evaluation. And hence, reduction/optimisation in the substitution case.

There should be some way of rewriting each function that returns 1 only when p and q are both 1 using (and p q), and never using p or q directly.

Perhaps, this is also an intrduction to the idea of abstraction.
5.514 Once a notation has been established, there will be in it a rule governing the construction of all propositions that negate p, a rule governing the construction of all propositions that affirm p, and a rule governing the construction of all propositions that affirm p or q; and so on. These rules are equivalent to the symbols; and in them their sense is mirrored.

Equivalent to symbols… I think, in Scheme-speak it will mean that "rules" (whatever that is) will be indistinguishable from "variables" that can be resolved. But variables will resolve to values, and rules will resolve to combinations.
5.515 It must be manifest in our symbols that it can only be propositions that are combined with one another by ‘v’, ‘.’, etc. And this is indeed the case, since the symbol in ‘p’ and ‘q’ itself presupposes ‘v’, ‘~’, etc. If the sign ‘p’ in ‘p v q’ does not stand for a complex sign, then it cannot have sense by itself: but in that case the signs ‘p v p’, ‘p.p’, etc., which have the same sense as p, must also lack sense. But if ‘p v p’ has no sense, then ‘p v q’ cannot have a sense either.

I think that what he is trying to say here is that logical operations should be closed, that is should be able to be combined indefinitely.
1. 5.5151 Must the sign of a negative proposition be constructed with that of the positive proposition? Why should it not be possible to express a negative proposition by means of a negative fact? (E.g. suppose that ‘a’ does not stand in a certain relation to ‘b’; then this might be used to say that aRb was not the case.) But really even in this case the negative proposition is constructed by an indirect use of the positive. The positive proposition necessarily presupposes the existence of the negative proposition and vice versa.
  
  So, bits can be flipped, just that.

5.5.2. 5.52 If ξ has as its values all the values of a function fx for all values of x, then N((ξ)) = ~(∃x).fx.

5.521 I dissociate the concept all from truth-functions. Frege and Russell introduced generality in association with logical product or logical sum. This made it difficult to understand the propositions ‘(∃x).fx’ and ‘(x).fx’, in which both ideas are embedded.

For him, generality is not a part of the logical system.
5.522 What is peculiar to the generality-sign is first, that it indicates a logical prototype, and secondly, that it gives prominence to constants.

What is a "logical prototype"? Why would it emphasise constants?
5.523 The generality-sign occurs as an argument.

To a function? Or in a logical derivation?
5.524 If objects are given, then at the same time we are given all objects. If elementary propositions are given, then at the same time all elementary propositions are given.

That's clear. Input cannot grow.
5.525 It is incorrect to render the proposition ‘(∃x).fx’ in the words, ‘fx is ~possible~’, as Russell does. The certainty, possibility, or impossibility of a situation is not expressed by a proposition, but by an expression’s being a tautology, a proposition with sense, or a contradiction. The precedent to which we are constantly inclined to appeal must reside in the symbol itself.

There is something important here that I do not understand… So, instead of "X is possible", he wants to say "X has sense"? That is, instead of "there is some argument on which X gives a correct answer", he wants universality over input?
5.526 We can describe the world completely by means of fully generalized propositions, i.e. without first correlating any name with a particular object. Then, in order to arrive at the customary mode of expression, we simply need to add, after an expression like, ‘There is one and only one x such that . . .’, the words, ‘and that x is a’.

So, still, he insists that the logical system should be complete. Then substituting the initial condition, we should get the actual trajectory in the state space.

I think that later logicians prove this to be impossible. But for a closed world, let it be.

Still, this reminds me about the fact that we can do some fun stuff with no data, only programming the machine with a tiny seed. Mandelbrot and stuff.
1. 5.5261 A fully generalized proposition, like every other proposition, is composite. (This is shown by the fact that in ‘(∃x,φ).φx’ we have to mention ‘ φ ’ and ‘x’ separately. They both, independently, stand in signifying relations to the world, just as is the case in ungeneralized propositions.) It is a mark of a composite symbol that it has something in common with other symbols.
  
  I guess… unless you have no input…
2. 5.5262 The truth or falsity of every proposition does make some alteration in the general construction of the world. And the range that the totality of elementary propositions leaves open for its construction is exactly the same as that which is delimited by entirely general propositions. (If an elementary proposition is true, that means, at any rate, one more true elementary proposition.)
  
  Entirely general propositions… I guess, means propositions that make sense on every input. The most general… should depend on all of the input, I guess?
  
  So one bit changed in the input results in the total change, which can be reflected in the behaviour of the function that uses all of the input.

5.5.3. 5.53 Identity of object I express by identity of sign, and not by using a sign for identity. Difference of objects I express by difference of signs.

How does this correlate with his discussion of equality?

4.24, 4.241 and such?

5.5301 It is self-evident that identity is not a relation between objects. This becomes very clear if one considers, for example, the proposition ‘(x):fx.⊃.x = a’. What this proposition says is simply that only a satisfies the function f, and not that only things that have a certain relation to a satisfy the function f. Of course, it might then be said that only a did have this relation to a; but in order to express that, we should need the identity-sign itself.

And? Yes, a complex logical expression is optimisable. But optimisation is not free.
5.5302 Russell’s definition of ‘=’ is inadequate, because according to it we cannot say that two objects have all their properties in common. (Even if this proposition is never correct, it still has sense.)

This is, again, the difference between eq? and equal?.
5.5303 Roughly speaking, to say of two things that they are identical is nonsense, and to say of one thing that it is identical with itself is to say nothing at all.

I think that this is his big mistake. Maybe, when you are modelling the world, you need to make sure that all things that are equal? are also eq?, but you can do a lot of things even if that condition is not satisfied.

5.531 Thus I do not write ‘f(a,b).a = b’, but ‘f(a,a)’ (or ‘f(b,b)’); and not f(a,b).∼a = b’, but ‘f(a,b)’.

Again, he thinks that this clarifies things, but that it so unessential to philosophy.

Yeah, yeah, programming style should be good. Don't make useless variables.

But now we do not consider that terribly important because we machines can spot a lot of errors like this.

5.532 And analogously I do not write ‘(∃x,y).f(x,y).x = y’, but ‘(∃x).f(x,x)’; and not ‘(∃x,y).f(x,y).~x = y’, but ‘(∃x.y).f(x,y)’. (So Russell’s ‘(∃x,y).fxy’ becomes ‘(∃x.y).f(x,y).v.(∃x).f(x,x)’.)

Seems that Wittgenstein gets really annoyed by extra notation.

How would he solve cases when he does not, at first, know that x=y, but that is actually derivable?

5.5321 Thus, for example, instead of ‘(x):fx ⊃ x = a’ we write ‘(∃x).fx.⊃.fa: (∃x,y).fx.fy’. And the proposition, ‘~Only one x satisfies f( )’, will read ‘(∃x).fx: ~(∃x,y).fx.fy’.

Okay.

5.533 The identity-sign, therefore, is not an essential constituent of conceptual notation.

So, he has derived this particular logic of his, in which it is possible to avoid using equality in exchange for the manipulation with quantifiers.

In 2021, I think we would have consider this a defeat, rather than an achievement.

Working with quantifiers is a pain, working with equality is a bliss.

5.534 And now we see that in a correct conceptual notation pseudo-propositions like ‘a = a’, ‘a = b.b = c.⊃ a = c’, ‘(x).x = x’, ‘(∃x).x = a’, etc. cannot even be written down.

It is interesting idea that you "cannot write meaningless code".

5.535 This also disposes of all the problems that were connected with such pseudo-propositions. All the problems that Russell’s ‘axiom of infinity’ brings with it can be solved at this point. What the axiom of infinity is intended to say would express itself in language through the existence of infinitely many names with different meanings.

This is the first time he actually refers to a concrete place in the works of his predecessors.

The "axiom of infinity" is defined at 120.30 or something like that, in the second volume of Principia Mathematica.

In any case, it would have been amazing to work in a world where distinct things are by default always distinct in description, but that is, sadly, almost never the case.

5.5351 There are certain cases in which one is tempted to use expressions of the form ‘a = a’ or ‘p ⊃ p’ and the like. In fact, this happens when one wants to talk about prototypes, e.g. about proposition, thing, etc. Thus in Russell’s Principles of Mathematics ‘p is a proposition’ —which is nonsense— was given the symbolic rendering ‘p ⊃ p’ and placed as an hypothesis in front of certain propositions in order to exclude from their argument-places everything but propositions. (It is nonsense to place the hypothesis ‘p ⊃ p’ in front of a proposition, in order to ensure that its arguments shall have the right form, if only because with a non-proposition as argument the hypothesis becomes not false but nonsensical, and because arguments of the wrong kind make the proposition itself nonsensical, so that it preserves itself from wrong arguments just as well, or as badly, as the hypothesis without sense that was appended for that purpose.)

So, Wittgenstein insists that propositions, and, perhaps, other logical constructs cannot be reasoned about by the system they are parts of.

This sounds both good and bad. Good, because I always found just wrong the statements similar to the 'Godelian proposition'. (That one that has a number, and states that the statement with that number is not provable.) This sounded wrong, because interpreters should not be able to reason about themselves.

On the other hand, why not? There are virtual machines, there are metacircular interpreters. Why not, after all?
5.5352 In the same way people have wanted to express, ‘There are no ~things~’, by writing ‘~(∃x).x = x’. But even if this were a proposition, would it not be equally true if in fact ‘there were things’ but they were not identical with themselves?

This clause is interesting because it uses ‘~(∃x).x = x’. Indeed, I see what Wittgenstein was annoyed about.

5.5.4. 5.54 In the general propositional form propositions occur in other propositions only as bases of truth-operations.

Because in his world, truth-operations evaluate.

5.541 At first sight it looks as if it were also possible for one proposition to occur in another in a different way. Particularly with certain forms of proposition in psychology, such as ‘A believes that p is the case’ and ‘A has the thought p’, etc. For if these are considered superficially, it looks as if the proposition p stood in some kind of relation to an object A. (And in modern theory of knowledge (Russell, Moore, etc.) these propositions have actually been construed in this way.)

Ah, ok. He seems to be repeating his premise that relationships other than binary are not useful. Ok, now that in computers our world is binary, this seems obvious.
5.542 It is clear, however, that ‘A believes that p’, ‘A has the thought p’, and ‘A says p’ are of the form ‘“p” says p’: and this does not involve a correlation of a fact with an object, but rather the correlation of facts by means of the correlation of their objects.

Basically, this is a call for an introduction of more constructions into the language itself.
1. 5.5421 This shows too that there is no such thing as the soul —the subject, etc.— as it is conceived in the superficial psychology of the present day. Indeed a composite soul would no longer be a soul.
  
  This requires a bit of thinking, but essentially, means that since everything consists of primitives, and people consist of molecules, and their behaviour is computable, there is no such a thing that "makes decisions".
  
  Maybe, under "superficial psychology" he means "free will". If everything is computable, I guess, you can say that there is no free will.
2. 5.5422 The correct explanation of the form of the proposition, ‘A makes the judgement p’, must show that it is impossible for a judgement to be a piece of nonsense. (Russell’s theory does not satisfy this requirement.)
  
  A compiler should not compile senseless things. I guess, Wittgenstein would like Haskell.
3. 5.5423 To perceive a complex means to perceive that its constituents are related to one another in such and such a way. This no doubt also explains why there are two possible ways of seeing the figure as a cube; and all similar phenomena. For we really see two different facts. (If I look in the first place at the corners marked a and only glance at the b’s, then the a’s appear to be in front, and vice versa).
  
  I guess, a computing alternative would be a piece of code that is valid both as Scheme, and as Common Lisp.

5.5.5. 5.55 We now have to answer a priori the question about all the possible forms of elementary propositions. Elementary propositions consist of names. Since, however, we are unable to give the number of names with different meanings, we are also unable to give the composition of elementary propositions.

Elementary propositions consist of names. Without loss of generality, just two names, 1 and 0.

Cannot give the composition of elementary propositions? I cannot understand.

5.551 Our fundamental principle is that whenever a question can be decided by logic at all it must be possible to decide it without more ado. (And if we get into a position where we have to look at the world for an answer to such a problem, that shows that we are on a completely wrong track.)

This is very practical and should be compulsory to read for all physicists.
5.552 The ‘experience’ that we need in order to understand logic is not that something or other is the state of things, but that something is: that, however, is not an experience. Logic is prior to every experience — that something is so. It is prior to the question ‘How?’, not prior to the question ‘What?’

My English parser got broken here. Indeed, the replacement of "what?" with "how?" is quite a step forward in terms of theory of knowledge.

"What?", I guess, should be defined by philosophy, not logic.
1. 5.5521 And if this were not so, how could we apply logic? We might put it in this way: if there would be a logic even if there were no world, how then could there be a logic given that there is a world?
  
  Well, you can do all sort of fun logical games, such as plotting the Mandelbrot set, without any input.
5.553 Russell said that there were simple relations between different numbers of things (individuals). But between what numbers? And how is this supposed to be decided?—By experience? (There is no pre-eminent number.)

No, I cannot understand. Indeed, numbers are not fundamental, as Russell proves. So, relations between objects are not fundamental too?
5.554 It would be completely arbitrary to give any specific form.

Does that mean that there should be an algorithm to generate random valid code?
1. 5.5541 It is supposed to be possible to answer a priori the question whether I can get into a position in which I need the sign for a 27-termed relation in order to signify something.
  
  Does he imply the need for lower bounds? At least in the length of the source code?
  
  In general, there are various methods of finding lower bounds.
2. 5.5542 But is it really legitimate even to ask such a question? Can we set up a form of sign without knowing whether anything can correspond to it? Does it make sense to ask what there must be in order that something can be the case?
  
  Why not? Syntax checkers are not new. Static/dynamic analysis tools are not new.
5.555 Clearly we have some concept of elementary propositions quite apart from their particular logical forms. But when there is a system by which we can create symbols, the system is what is important for logic and not the individual symbols. And anyway, is it really possible that in logic I should have to deal with forms that I can invent? What I have to deal with must be that which makes it possible for me to invent them.

That "insight"? Indeed, he is already thinking about an algorithm for generating true statements.
5.556 There cannot be a hierarchy of the forms of elementary propositions. We can foresee only what we ourselves construct.

Input bits are all equal in rights.
1. 5.5561 Empirical reality is limited by the totality of objects. The limit also makes itself manifest in the totality of elementary propositions. Hierarchies are and must be independent of reality.
  
  The first and the second statement are true. The third does not seem to follow from them, but I can see why it should be the case.
2. 5.5562 If we know on purely logical grounds that there must be elementary propositions, then everyone who understands propositions in their unanalysed form must know it.
  
  Well, everyone who understands logic, I guess. This kinda implies the ability to implement an algorithm in any language.
3. 5.5563 In fact, all the propositions of our everyday language, just as they stand, are in perfect logical order. — That utterly simple thing, which we have to formulate here, is not a likeness of the truth, but the truth itself in its entirety. (Our problems are not abstract, but perhaps the most concrete that there are.)
  
  No, wait, people are very capable of creating nonsensical sentences.
  
  Everyday life is a very complex thing actually, to model.
5.557 The application of logic decides what elementary propositions there are. What belongs to its application, logic cannot anticipate. It is clear that logic must not clash with its application. But logic has to be in contact with its application. Therefore logic and its application must not overlap.

I think this is an emotional stance against breaking abstraction barriers.
1. 5.5571 If I cannot say a priori what elementary propositions there are, then the attempt to do so must lead to obvious nonsense.
  
  Is this a working method of thinking? Announce facts, and try to reason whether they are meaningful?

5.6. 5.6 `The limits of my language` mean the limits of my world.

Presumably, we are thinking with language. But let us imagine a person who used to have the sense of smell, and then lost it. He still remembers what the smell is, but it is not longer in his world.

5.6.1. 5.61 Logic pervades the world: the limits of the world are also its limits. So we cannot say in logic, ‘The world has this in it, and this, but not that.’ For that would appear to presuppose that we were excluding certain possibilities, and this cannot be the case, since it would require that logic should go beyond the limits of the world; for only in that way could it view those limits from the other side as well. We cannot think what we cannot think; so what we cannot think we cannot say either.

What we cannot digitise, we cannot produce any program about.

5.6.2. 5.62 This remark provides the key to the problem, how much truth there is in solipsism. For what the solipsist `means` is quite correct; only it cannot be `said`, but makes itself manifest. The world is my world: this is manifest in the fact that the limits of `language` (of that language which alone I understand) mean the limits of my world.

Solipsism cannot be "articulated", because it itself presupposes the fact that the words of the articulation will not be transmitted anywhere.

5.621 The world and life are one.

Computo ergo sum.

5.6.3. 5.63 I am my world. (The microcosm.)

Solipsism.

5.631 There is no such thing as the subject that thinks or entertains ideas. If I wrote a book called The World as I found it, I should have to include a report on my body, and should have to say which parts were subordinate to my will, and which were not, etc., this being a method of isolating the subject, or rather of showing that in an important sense there is no subject; for it alone could not be mentioned in that book.—

In fact, this is how good texts are written! A writer gives an account of himself first, in order to let the readers understand from which viewpoint he is writing.

I remember that there used to be a similar tradition in the common law court procedures. Explain yourself first.
5.632 The subject does not belong to the world: rather, it is a limit of the world.

A computer.
5.633 Where in the world is a metaphysical subject to be found? You will say that this is exactly like the case of the eye and the visual field. But really you do not see the eye. And nothing in the visual field allows you to infer that it is seen by an eye.

Well, in a computer we have a lot of things to query the underlying system. cpuid, cpuinfo, just reading the interpreter's memory.
1. 5.6331 For the form of the visual field is surely not like this @
  
  (Wittgenstein here has a sketch of a potential visual field.)
5.634 This is connected with the fact that no part of our experience is at the same time a priori. Whatever we see could be other than it is. Whatever we can describe at all could be other than it is. There is no a priori order of things.

So, the input that we expect to be a picture of a chair may actually be a wavefront on the camera sensor. We just do not know.

5.6.4. 5.64 Here it can be seen that solipsism, when its implications are followed out strictly, coincides with pure realism. The self of solipsism shrinks to a point without extension, and there remains the reality co-ordinated with it.

Okay, so he is actually a solipsist.

5.641 Thus there really is a sense in which philosophy can talk about the self in a non-psychological way. What brings the self into philosophy is the fact that ‘the world is my world’. The philosophical self is not the human being, not the human body, or the human soul, with which psychology deals, but rather the metaphysical subject, the limit of the world — not a part of it.

So, "self" in philosophy is a model of the interpreter that is running our simulation.

6. 6 The general form of a truth-function is [(p), (ξ), N((ξ))]. This is the general form of a proposition.

All possible p (inputs), all possible ξ (propositions), all applications of the Horn rule, which should tell us whether non-elementary propositions are derivable?

6.001 What this says is just that every proposition is a result of successive applications to elementary propositions of the operation N((ξ)).

Yah, seems like everything should be one giant Prolog interpreter.

Also, ξ is off colossal size.
6.002 If we are given the general form according to which propositions are constructed, then with it we are also given the general form according to which one proposition can be generated out of another by means of an operation.

So we first generate all possible outputs.

6.0.1. 6.01 Therefore the general form of an operation Ω’((η)) is [ (ξ) , N((ξ))]’ ((η)) (= [(η),(ξ), N((ξ))]). This is the most general form of transition from one proposition to another.

The argument here is, I guess, all propositions (not necessarily elementary) which have already been proven.

6.0.2. 6.02 And this is how we arrive at numbers. I give the following definitions \(x = Ω^{0'}x Def., Ω’Ω^{ν'}x = Ω^{ν+1'}x Def. \) So, in accordance with these rules, which deal with signs, we write the series x, Ω’x, Ω’Ω’x, Ω’Ω’Ω’x, . . . , in the following way Ω^0’x, Ω⁰⁺¹’x, Ω⁰⁺¹⁺¹’x, Ω^0+1+1+1’x, … . Therefore, instead of ‘[x, ξ , Ω’ ξ ]’, I write ‘[ Ω^0’x, Ω^ν’x, Ω^ν+1’x]’. And I give the following definitions 0+1 = 1 Def., 0+1+1 = 2 Def., 0+1+1+1 = 3 Def., (and so on).

Again, this is kind of Church numerals.

6.021 A number is the exponent of an operation.

Yes, Church numeral.
6.022 The concept of number is simply what is common to all numbers, the general form of a number. The concept of number is the variable number. And the concept of numerical equality is the general form of all particular cases of numerical equality.

This is the idea we often see in later mathematics, although not that much in programming. "What behaves as X can be used as X".

6.0.3. 6.03 The general form of an integer is [0, ξ, ξ+1].

This +1, I guess, contains a lot of stuff under the carpet.

6.031 The theory of classes is completely superfluous in mathematics. This is connected with the fact that the generality required in mathematics is not accidental generality.

Does he mean the "theory of sets"? Or the distinction between classes and sets?

6.1. 6.1 The propositions of logic are tautologies.

Or 'laws of logic'.

6.1.1. 6.11 Therefore the propositions of logic say nothing. (They are the analytic propositions.)

But they can condense information from the input.

6.111 All theories that make a proposition of logic appear to have content are false. One might think, for example, that the words ‘true’ and ‘false’ signified two properties among other properties, and then it would seem to be a remarkable fact that every proposition possessed one of these properties. On this theory it seems to be anything but obvious, just as, for instance, the proposition, ‘All roses are either yellow or red’, would not sound obvious even if it were true. Indeed, the logical proposition acquires all the characteristics of a proposition of natural science and this is the sure sign that it has been construed wrongly.

Fortran has a bessel function as a primitive. Surely superfluous!
6.112 The correct explanation of the propositions of logic must assign to them a unique status among all propositions.

The status of "laws"?
6.113 It is the peculiar mark of logical propositions that one can recognize that they are true from the symbol alone, and this fact contains in itself the whole philosophy of logic. And so too it is a very important fact that the truth or falsity of non-logical propositions cannot be recognized from the propositions alone.

Well, make your programming languages consistent.

6.1.2. 6.12 The fact that the propositions of logic are tautologies shows the formal — logical — properties of language and the world. The fact that a tautology is yielded by this particular way of connecting its constituents characterizes the logic of its constituents. If propositions are to yield a tautology when they are connected in a certain way, they must have certain structural properties. So their yielding a tautology when combined in this way shows that they possess these structural properties.

And? Is this the property of a particular logic? Or logics in general? Or this world? Or human brain?

6.1201 For example, the fact that the propositions ‘p’ and ‘~p’ in the combination ‘~(p.~p)’ yield a tautology shows that they contradict one another. The fact that the propositions ‘p ⊃ q’, ‘p’, and ‘q’, combined with one another in the form ‘(p ⊃ q).(p):⊃:(q)’, yield a tautology shows that q follows from p and p⊃q. The fact that ‘(x).fx:⊃fa’ is a tautology shows that fa follows from (x).fx. Etc. etc.

So, laws of logic are tautologies in the logical notation. Is that the thing Wittgenstein want to say?

Again, we use tautologies to optimise the code.
6.1202 It is clear that one could achieve the same purpose by using contradictions instead of tautologies.

And we use contraditions to find bugs in code.
6.1203 In order to recognize an expression as a tautology, in cases where no generality-sign occurs in it, one can employ the following intuitive method: instead of ‘p’, ‘q’, ‘r’, etc. I write ‘TpF’, ‘TqF’, ‘TrF’, etc. Truth-combinations I express by means of brackets, e.g. @ and I use lines to express the correlation of the truth or falsity of the whole proposition with the truth-combinations of its truth-arguments, in the following way @ So this sign, for instance, would represent the proposition p ⊃ q. Now, by way of example, I wish to examine the proposition ~(p.~p) (the law of contradiction) in order to determine whether it is a tautology. In our notation the form ‘~ξ ’ is written as @ and the form ‘~ξ . η ’ as @ . Hence the proposition ~(p.~q) reads as follows @ If we here substitute ‘p’ for ‘q’ and examine how the outermost T and F are connected with the innermost ones, the result will be that the truth of the whole proposition is correlated with all the truth-combinations of its argument, and its falsity with none of the truth-combinations.

(For the graphs, see the printed edition of the Tractatus.)

He is trying to draw an evaluation graph for a tautology.

And, expectedly, getting that there is no path that is leading to the false value.

6.121 The propositions of logic demonstrate the logical properties of propositions by combining them so as to form propositions that say nothing. This method could also be called a zero-method. In a logical proposition, propositions are brought into equilibrium with one another, and the state of equilibrium then indicates what the logical constitution of these propositions must be.

It's kind of like a sketch of a method for the creation of new tautologies…

I am trying to think whether this is actually an inverse way of writing the Horn clause? Like, in the "Definite" form of it.

6.122 It follows from this that we can actually do without logical propositions; for in a suitable notation we can in fact recognize the formal properties of propositions by mere inspection of the propositions themselves.

But, I thing he still needs his 'truth-operations'.

6.1221 If, for example, two propositions ‘p’ and ‘q’ in the combination ‘p ⊃ q’ yield a tautology, then it is clear that q follows from p. For example, we see from the two propositions themselves that ‘q’ follows from ‘p ⊃ q.p’, but it is also possible to show it in this way: we combine them to form ‘p ⊃ q.p:⊃:q’, and then show that this is a tautology.

So, this seems to be an algorithisable rule. Is there an inference engine that supports this way of inference?
6.1222 This throws some light on the question why logical propositions cannot be confirmed by experience any more than they can be refuted by it. Not only must a proposition of logic be irrefutable by any possible experience, but it must also be unconfirmable by any possible experience.

Well, logical law must be true for all possible inputs. I guess, unconfirmable here can be seen as "you need to check it on all possible inputs".
6.1223 Now it becomes clear why people have often felt as if it were for us to ‘postulate’ the ‘truths of logic’. The reason is that we can postulate them in so far as we can postulate an adequate notation.

Not obvious to me. If it is all subjective and non-verifiable…
6.1224 It also becomes clear now why logic was called the theory of forms and of inference.

And programming is seen as a theory of writing code?

6.123 Clearly the laws of logic cannot in their turn be subject to laws of logic. (There is not, as Russell thought, a special law of contradiction for each ‘type’; one law is enough, since it is not applied to itself.)

Well, Godel's incompleteness theorem relies on the ability of applying the laws of logic to the laws of logic.

This 'cannot be subject', isn't a law of logic then?

6.1231 The mark of a logical proposition is not general validity. To be general means no more than to be accidentally valid for all things. An ungeneralized proposition can be tautological just as well as a generalized one.

Yes, #t is a tautology.
6.1232 The general validity of logic might be called essential, in contrast with the accidental general validity of such propositions as ‘All men are mortal’. Propositions like Russell’s ‘axiom of reducibility’ are not logical propositions, and this explains our feeling that, even if they were true, their truth could only be the result of a fortunate accident.

Axiom of Reducibility is in Introduction, Chapter 2, Section 6 of Principia Mathematica.

It roughly says that for every function f: A -> B there exists a predicative version: g: (A,B) -> {0,1}

Well, since in Wittgenstein's theory it is possible to avoid non-binary things in general…
6.1233 It is possible to imagine a world in which the axiom of reducibility is not valid. It is clear, however, that logic has nothing to do with the question whether our world really is like that or not.

"It is possible to imagine a world in which the axiom of reducibility is not valid".

How is it possible? Well, programmatically, it is obvious. You just code in f(a), but not g(a,b). However, writing g having f and equality is trivial. Although Wittgenstein's logic has no equality, but I remember him providing equality as a composite logical operation.

But in "reality", whatever this means, it's absolutely self-evident that if f exists, g exists.

It is possible, though, that f is not computable, while g is computable.

6.124 The propositions of logic describe the scaffolding of the world, or rather they represent it. They have no ‘subject-matter’. They presuppose that names have meaning and elementary propositions sense; and that is their connexion with the world. It is clear that something about the world must be indicated by the fact that certain combinations of symbols — whose essence involves the possession of a determinate character — are tautologies. This contains the decisive point. We have said that some things are arbitrary in the symbols that we use and that some things are not. In logic it is only the latter that express: but that means that logic is not a field in which we express what we wish with the help of signs, but rather one in which the nature of the absolutely necessary signs speaks for itself. If we know the logical syntax of any sign-language, then we have already been given all the propositions of logic.

The last sentence is important. If you have a compiler, you have all the possible logic inside of it. And at least logically, since all compilers are Turin-complete, all logics are roughly the same.

6.125 It is possible — indeed possible even according to the old conception of logic — to give in advance a description of all ‘true’ logical propositions.

Well, if they evaluate to 1?

6.1251 Hence there can never be surprises in logic.

What is a "surprise"?

Well, informationally, logic only squeezes information from the input, it cannot do more.

But for me, certain results are still surprising.

6.126 One can calculate whether a proposition belongs to logic, by calculating the logical properties of the symbol. And this is what we do when we ‘prove’ a logical proposition. For, without bothering about sense or meaning, we construct the logical proposition out of others using only rules that deal with signs. The proof of logical propositions consists in the following process: we produce them out of other logical propositions by successively applying certain operations that always generate further tautologies out of the initial ones. (And in fact only tautologies follow from a tautology.) Of course this way of showing that the propositions of logic are tautologies is not at all essential to logic, if only because the propositions from which the proof starts must show without any proof that they are tautologies.

Again, this suggests that optimisation is a valid area of logic.

6.1261 In logic process and result are equivalent. (Hence the absence of surprise.)

I keep saying that speed is also important. In Prolog, the results comes together with a derivation of that result (if you do not use cut). But, again, computability leaves a place for surprises. Say, you make your computer compute an uncomputable F(x), on some x, and it halts. This is a surprise, isn't it?
6.1262 Proof in logic is merely a mechanical expedient to facilitate the recognition of tautologies in complicated cases.

Yes!
6.1263 Indeed, it would be altogether too remarkable if a proposition that had sense could be proved logically from others, and so too could a logical proposition. It is clear from the start that a logical proof of a proposition that has sense and a proof in logic must be two entirely different things.

One depends on input, and one is not. One is an optimisation technique, another one is a result of working on the input.
6.1264 A proposition that has sense states something, which is shown by its proof to be so. In logic every proposition is the form of a proof. Every proposition of logic is a modus ponens represented in signs. (And one cannot express the modus ponens by means of a proposition.)

I think, he is being desperate here. Many languages have facilities for metaprogramming. Modus ponens eventually makes its way here with the Horn rule.
6.1265 It is always possible to construe logic in such a way that every proposition is its own proof.

Really? There is no way to "fit" a proposition by means of an inferential procedure?

6.127 All the propositions of logic are of equal status: it is not the case that some of them are essentially primitive propositions and others essentially derived propositions. Every tautology itself shows that it is a tautology.

Well, as long as they form a complete set. (In the Godelian sense.)

6.1271 It is clear that the number of the ‘primitive propositions of logic’ is arbitrary, since one could derive logic from a single primitive proposition, e.g. by simply constructing the logical product of Frege’s primitive propositions. (Frege would perhaps say that we should then no longer have an immediately self-evident primitive proposition. But it is remarkable that a thinker as rigorous as Frege appealed to the degree of self-evidence as the criterion of a logical proposition.)

Well, having huge programming languages is usually not a problem. The opposite direction is usually harder.

6.1.3. 6.13 Logic is not a body of doctrine, but a mirror-image of the world. Logic is transcendental.

And still, we develop it. We at least develop new, stronger programming languages and provers.

6.2. 6.2 Mathematics is a logical method. The propositions of mathematics are equations, and therefore pseudo-propositions.

Intuitively it relies on the "axiom of reducibility". I am not sure I understood Wittgenstein's proof of why it is not required, but so be it.

6.2.1. 6.21 A proposition of mathematics does not express a thought.

A thought is a "logical picture of facts". Since logic is outside of the world, it does not consist of facts.

6.211 Indeed in real life a mathematical proposition is never what we want. Rather, we make use of mathematical propositions only in inferences from propositions that do not belong to mathematics to others that likewise do not belong to mathematics. (In philosophy the question, ‘What do we actually use this word or this proposition for?’ repeatedly leads to valuable insights.)

Sure, we use mathematics to predict something in the "world", not for itself. If we want to prove something with agda, we still want social appreciation.

6.2.2. 6.22 The logic of the world, which is shown in tautologies by the propositions of logic, is shown in equations by mathematics.

And, again, it is belived that you can express one in terms of another.

6.2.3. 6.23 If two expressions are combined by means of the sign of equality, that means that they can be substituted for one another. But it must be manifest in the two expressions themselves whether this is the case or not. When two expressions can be substituted for one another, that characterizes their logical form.

Again, this confusion about equality. eq? is not the same as equal?, and an equation is not a law, as it is only true for certain x.

6.231 It is a property of affirmation that it can be construed as double negation. It is a property of ‘1+1+1+1’ that it can be construed as ‘(1+1)+(1+1)’.

Logic, not the world.
6.232 Frege says that the two expressions have the same meaning but different senses. But the essential point about an equation is that it is not necessary in order to show that the two expressions connected by the sign of equality have the same meaning, since this can be seen from the two expressions themselves.

I think this use of the word "equation" is not correct in the modern sense.

I think that Frege is right, and Wittgenstein is wrong.

They "mean the same" for some x. Their "senses" for arbitrary x's are different.
1. 6.2321 And the possibility of proving the propositions of mathematics means simply that their correctness can be perceived without its being necessary that what they express should itself be compared with the facts in order to determine its correctness.
  
  Well, we still do a lot of mathematical experiments.
2. 6.2322 It is impossible to assert the identity of meaning of two expressions. For in order to be able to assert anything about their meaning, I must know their meaning, and I cannot know their meaning without knowing whether what they mean is the same or different.
  
  I think this is again due to the lack of eq? in his system. Or, maybe, recursive eq?.
3. 6.2323 An equation merely marks the point of view from which I consider the two expressions: it marks their equivalence in meaning.
  
  For a particular x.
6.233 The question whether intuition is needed for the solution of mathematical problems must be given the answer that in this case language itself provides the necessary intuition.

I think this works only for algorithmisable problems.
1. 6.2331 The process of calculating serves to bring about that intuition. Calculation is not an experiment.
  
  Yes, there is something wrong about proving statements with brute force.
6.234 Mathematics is a method of logic.

I agree.
1. 6.2341 It is the essential characteristic of mathematical method that it employs equations. For it is because of this method that every proposition of mathematics must go without saying.
  
  Because it is not about the world? Meh, we are still drawing a lot of inspiration from the real world when programming or deriving. We the people are also a part of the world, even though for solipsists it is not so evident…

6.2.4. 6.24 The method by which mathematics arrives at its equations is the method of substitution. For equations express the substitutability of two expressions and, starting from a number of equations, we advance to new equations by substituting different expressions in accordance with the equations.

Well, that's not true, unfortunately. For high school mathematics, may be. But for actual mathematics… no. Although perhaps with automated provers you can do at least a part of work by pure substitution.

6.241 Thus the proof of the proposition 2 × 2 = 4 runs as follows: (Ω^ν)^μ’x = Ω^ν×μ ’x Def., Ω^2×2’x = (Ω^2)^2’x = (Ω^2)¹⁺¹’x = Ω^2’Ω^2’x = Ω¹⁺¹’Ω¹⁺¹’x = (Ω’Ω)’(Ω’Ω)’x = Ω’Ω’Ω’Ω’x = Ω^1+1+1+1’x = Ω^4’x.

Can this really be called "a proof"? And frankly, it seems much more evident if seeing powers as tokens that can be tossed, rather than pure substitution.

6.3. 6.3 The exploration of logic means the exploration of everything that is subject to law. And outside logic everything is accidental.

Law is the behaviour or the compiler. Outside is the input.

6.3.1. 6.31 The so-called law of induction cannot possibly be a law of logic, since it is obviously a proposition with sense. — Nor, therefore, can it be an a priori law.

And, still, I think that it is used axiomatically, in general.

Wikipedia says "Proofs or constructions using induction and recursion often use the axiom of choice to produce a well-ordered relation that can be treated by transfinite induction. However, if the relation in question is already well-ordered, one can often use transfinite induction without invoking the axiom of choice."

So, "often" is not always.

6.3.2. 6.32 The law of causality is not a law but the form of a law.

Form of which law?

6.321 ‘Law of causality’ — that is a general name. And just as in mechanics, for example, there are ‘minimum-principles’, such as the law of least action, so too in physics there are causal laws, laws of the causal form.

Hm… is the idea here only that causality principle can be reformulated in terms of other principles?
1. 6.3211 Indeed people even surmised that there must be a ‘law of least action’ before they knew exactly how it went. (Here, as always, what is certain a priori proves to be something purely logical.)
  
  The law of least action is clearly a logical construct. I am still not convinced that causality is.

6.3.3. 6.33 We do not have an a priori belief in a law of conservation, but rather a priori knowledge of the possibility of a logical form.

We have "first integrals". And since physics obeys the differential equations with extreme precision, conservation laws naturally arise.

6.3.4. 6.34 All such propositions, including the principle of sufficient reason, the laws of continuity in nature and of least effort in nature, etc. etc. — all these are a priori insights about the forms in which the propositions of science can be cast.

Ok, I agree. They first appear as computational tricks, and later are supported by evidence.

6.341 Newtonian mechanics, for example, imposes a unified form on the description of the world. Let us imagine a white surface with irregular black spots on it. We then say that whatever kind of picture these make, I can always approximate as closely as I wish to the description of it by covering the surface with a sufficiently fine square mesh, and then saying of every square whether it is black or white. In this way I shall have imposed a unified form on the description of the surface. The form is optional, since I could have achieved the same result by using a net with a triangular or hexagonal mesh. Possibly the use of a triangular mesh would have made the description simpler: that is to say, it might be that we could describe the surface more accurately with a coarse triangular mesh than with a fine square mesh (or conversely), and so on. The different nets correspond to different systems for describing the world. Mechanics determines one form of description of the world by saying that all propositions used in the description of the world must be obtained in a given way from a given set of propositions—the axioms of mechanics. It thus supplies the bricks for building the edifice of science, and it says, ‘Any building that you want to erect, whatever it may be, must somehow be constructed with these bricks, and with these alone.’ (Just as with the number-system we must be able to write down any number we wish, so with the system of mechanics we must be able to write down any proposition of physics that we wish.)

So, mechanics is not considered to be science here, right? It is mathematics, that later forces the input to be interpreted in a certain (inclined to differential equations) way?
6.342 And now we can see the relative position of logic and mechanics. (The net might also consist of more than one kind of mesh: e.g. we could use both triangles and hexagons.) The possibility of describing a picture like the one mentioned above with a net of a given form tells us nothing about the picture. (For that is true of all such pictures.) But what does characterize the picture is that it can be described completely by a particular net with a particular size of mesh. Similarly the possibility of describing the world by means of Newtonian mechanics tells us nothing about the world: but what does tell us something about it is the precise way in which it is possible to describe it by these means. We are also told something about the world by the fact that it can be described more simply with one system of mechanics than with another.

I am thinking that he is missing the interpretation off output here.

So, mechanics tells us nothing about the world until we run our mechanical model, written in the language of logic, on the input… and compare with the measured results.

If the result is precise – the description is good and the world is mechnical.
6.343 Mechanics is an attempt to construct according to a single plan all the true propositions that we need for the description of the world.

That's like, the Hilbert's 6th problem? We would still need to digitise the input, but if we have a complete multiphysics simulator, we would be able to model a world.
1. 6.3431 The laws of physics, with all their logical apparatus, still speak, however indirectly, about the objects of the world.
  
  Is "physics" here notably different from "mechanics"?
2. 6.3432 We ought not to forget that any description of the world by means of mechanics will be of the completely general kind. For example, it will never mention particular point-masses: it will only talk about any point-masses whatsoever.
  
  At least F=mg should be replaced with Gmm/r^2.

6.3.5. 6.35 Although the spots in our picture are geometrical figures, nevertheless geometry can obviously say nothing at all about their actual form and position. The network, however, is purely geometrical; all its properties can be given a priori. Laws like the principle of sufficient reason, etc. are about the net and not about what the net describes.

Again, here mechanics is seen as a sufficient development of the laws of logic (of moving bodies).

6.3.6. 6.36 If there were a law of causality, it might be put in the following way: There are laws of nature. But of course that cannot be said: it makes itself manifest.

Cannot be said, I belive, here means "cannot be digitised". Indeed, you program the logical system, and causality is an intrinsic property off this system. You cannot "declare" the system to be causal, because it is the way you are writing it.

6.361 One might say, using Hertz’s terminology, that only connexions that are subject to law are thinkable.

Well, "thinkable" here, I guess, should be seen as "computable". There are not connections other than those you write in the code.

Who is the Hertz he is writing about? Same Heinrich Hertz, a physicist?
1. 6.3611 We cannot compare a process with ‘the passage of time’ — there is no such thing — but only with another process (such as the working of a chronometer). Hence we can describe the lapse of time only by relying on some other process. Something exactly analogous applies to space: e.g. when people say that neither of two events (which exclude one another) can occur, because there is nothing to cause the one to occur rather than the other, it is really a matter of our being unable to describe one of the two events unless there is some sort of asymmetry to be found. And if such an asymmetry is to be found, we can regard it as the cause of the occurrence of the one and the non-occurrence of the other.
  
  Well, our input is immutable, so there is no time other than machine time. The machine can count cycles of computation, but suppose it is hibernated, or even unplugged (NVRAM machine, obviously).
  
  I think this clause is due to the fact that in computing time is ill-defined.
  1. 6.36111 Kant’s problem about the right hand and the left hand, which cannot be made to coincide, exists even in two dimensions. Indeed, it exists in one-dimensional space - - - O—(a)—X - - X—(b)—O - - - - in which the two congruent figures, a and b, cannot be made to coincide unless they are moved out of this space. The right hand and the left hand are in fact completely congruent. It is quite irrelevant that they cannot be made to coincide. A right-hand glove could be put on the left hand, if it could be turned round in four-dimensional space.
    
    I think that here Wittgenstein is trying to approach the problem of verifiability. Logical correctness should be established by comparing with "other digitization", or with "other net", and this glove example is just an example of something hugely disparate.
6.362 What can be described can happen too: and what the law of causality is meant to exclude cannot even be described.

Because everything we are discussing must be digitised.

He is a bit manipulative here – his "can happen" should mean "can happen in a machine".

Cannot be describe means "our model does not support that".
6.363 The procedure of induction consists in accepting as true the simplest law that can be reconciled with our experiences.

Less code -> less errors.
1. 6.3631 This procedure, however, has no logical justification but only a psychological one. It is clear that there are no grounds for believing that the simplest eventuality will in fact be realized.
  
  Well, when you find a discrepancy, you implement more code.
  1. 6.36311 It is an hypothesis that the sun will rise tomorrow: and this means that we do not know whether it will rise.
    - (Son) Dad, why does the sun rise in the East?
    - (Father, programmer) Have you checked?
    - (Son) Yes, Dad, multiple times.
    - (Father) Still works?
    - (Son) Yes.
    - (Father) Please, please, do not touch anything!

6.3.7. 6.37 There is no compulsion making one thing happen because another has happened. The only necessity that exists is logical necessity.

Because that is how models work.

6.371 The whole modern conception of the world is founded on the illusion that the so-called laws of nature are the explanations of natural phenomena.

In fact, they do not really exist, but are just programs we write to predict further observations.
6.372 Thus people today stop at the laws of nature, treating them as something inviolable, just as God and Fate were treated in past ages. And in fact both are right and both wrong: though the view of the ancients is clearer in so far as they have a clear and acknowledged terminus, while the modern system tries to make it look as if everything were explained.

God is a scientific hypothesis, but it has very bad explanatory power, as it is essentially a huge dictionary.

The modern system is much straightforward algorithmically, but has better predictive power.
6.373 The world is independent of my will.

Ok.
6.374 Even if all that we wish for were to happen, still this would only be a favour granted by fate, so to speak: for there is no logical connexion between the will and the world, which would guarantee it, and the supposed physical connexion itself is surely not something that we could will.

Well, we can work to make wishes happen.

That is, if free will exists.

However, free will does not exist in a machine.
6.375 Just as the only necessity that exists is logical necessity, so too the only impossibility that exists is logical impossibility.

Well, this means that we should be able to find a certain input that makes our dreams happen (possible in the model).

But that is only in the machine.
1. 6.3751 For example, the simultaneous presence of two colours at the same place in the visual field is impossible, in fact logically impossible, since it is ruled out by the logical structure of colour. Let us think how this contradiction appears in physics: more or less as follows — a particle cannot have two velocities at the same time; that is to say, it cannot be in two places at the same time; that is to say, particles that are in different places at the same time cannot be identical. (It is clear that the logical product of two elementary propositions can neither be a tautology nor a contradiction. The statement that a point in the visual field has two different colours at the same time is a contradiction.)
  
  Well, I can't avoid nitpicking on the Bose-Einstein condensate (because the model is different).
  
  But yes, in classical mechanics this is prohibited by the digitisation model.

6.4. 6.4 All propositions are of equal value.

What is even a "value" in this case?

6.4.1. 6.41 The sense of the world must lie outside the world. In the world everything is as it is, and everything happens as it does happen: in it no value exists—and if it did exist, it would have no value. If there is any value that does have value, it must lie outside the whole sphere of what happens and is the case. For all that happens and is the case is accidental. What makes it non-accidental cannot lie within the world, since if it did it would itself be accidental. It must lie outside the world.

I guess, this is a reference to the free will again. A machine does not assign value to the simulation. Operators may want certain outcomes (value them more), and thus try to find conditions that satisfy them.

6.4.2. 6.42 So too it is impossible for there to be propositions of ethics. Propositions can express nothing that is higher.

Like, "A is good" is not a proposition, in the general sense, because good and evil are subjective.

I believe, it is possible to digitise some ethical system and make inferences, but since you cannot compare it with ground truth, it is meaningless.

6.421 It is clear that ethics cannot be put into words. Ethics is transcendental. (Ethics and aesthetics are one and the same.)

I can't say it is "clear", but this certainly seems to be the case. Subjective -> unverifiable.
6.422 When an ethical law of the form, ‘Thou shalt . . .’, is laid down, one’s first thought is, ‘And what if I do not do it?’ It is clear, however, that ethics has nothing to do with punishment and reward in the usual sense of the terms. So our question about the consequences of an action must be unimportant. — At least those consequences should not be events. For there must be something right about the question we posed. There must indeed be some kind of ethical reward and ethical punishment, but they must reside in the action itself. (And it is also clear that the reward must be something pleasant and the punishment something unpleasant.)

Yeah, again, because reward and punishment are not ethical per se, and feelings of good and bad are not verifiable and ill-defined.

Again, the main conclusion from this statement is that modelling ethics on a computer is unlikely to be successful.
6.423 It is impossible to speak about the will in so far as it is the subject of ethical attributes. And the will as a phenomenon is of interest only to psychology.

There are two "will"s. One is "want", the other "actions".

Wanting cannot be classified as good or bad, it is intrinsic, actions are not even obvious to be a subject of "free will".

So, his position here is that logically it is unlikely to condense any useful prediction from "digitisation of ethics or will".

6.4.3. 6.43 If the good or bad exercise of the will does alter the world, it can alter only the limits of the world, not the facts — not what can be expressed by means of language. In short the effect must be that it becomes an altogether different world. It must, so to speak, wax and wane as a whole. The world of the happy man is a different one from that of the unhappy man.

That is almost the philosophical treatment of set! in Scheme.

Mutation creates a different world.

Maybe we can even expect the "digitised input" of a human brain to be different from that of an unhappy man.

Again, emotions destroy predictability to a large extent.

In any case, all of that is not need if there is no free will, and people are just computers.

6.431 So too at death the world does not alter, but comes to an end.

The simulation ends. Each time you turn off your computer, you are destroying a world.
1. 6.4311 Death is not an event in life: we do not live to experience death. If we take eternity to mean not infinite temporal duration but timelessness, then eternal life belongs to those who live in the present. Our life has no end in just the way in which our visual field has no limits.
  
  Well, time is in general ill-defined in computers. If you freeze you program in a debugger, it kind of exists forever.
2. 6.4312 Not only is there no guarantee of the temporal immortality of the human soul, that is to say of its eternal survival after death; but, in any case, this assumption completely fails to accomplish the purpose for which it has always been intended. Or is some riddle solved by my surviving for ever? Is not this eternal life itself as much of a riddle as our present life? The solution of the riddle of life in space and time lies outside space and time. (It is certainly not the solution of any problems of natural science that is required.)
  
  So, the point here is that we all may be living in a simulation, which something outside of this computer is running to model/predict something.
  
  Or, rather that we cannot disprove the opposite.
6.432 How things are in the world is a matter of complete indifference for what is higher. God does not reveal himself in the world.

Just as the memory state is not aware of human interventions in a debugger. (There are different anti-debugging tricks though!)
1. 6.4321 The facts all contribute only to setting the problem, not to its solution.
  
  Because logic is not a part of the input. (Again, Lisp provides self-mutating programs, so maybe this is a bit obsolete.)

6.4.4. 6.44 It is not how things are in the world that is mystical, but that it exists.

Well, we do not know and by the laws of logic cannot know what is the machine that is running the simulation.

6.4.5. 6.45 To view the world sub specie aeterni is to view it as a whole — a limited whole. Feeling the world as a limited whole — it is this that is mystical.

"This book is dedicated, in respect and admiration, to the spirit that lives in the computer."

“I think that it’s extraordinarily important that we in computer science keep fun in computing. When it started out, it was an awful lot of fun. Of course, the paying customers got shafted every now and then, and after a while we began to take their complaints seriously. We began to feel as if we really were responsible for the successful, error-free perfect use of these machines. I don’t think we are. I think we’re responsible for stretching them, setting them off in new directions, and keeping fun in the house. I hope the field of computer science never loses its sense of fun. Above all, I hope we don’t become missionaries. Don’t feel as if you’re Bible salesmen. The world has too many of those already. What you know about computing other people will learn. Don’t feel as if the key to successful computing is only in your hands. What’s in your hands, I think and hope, is intelligence: the ability to see the machine as more than when you were first led up to it, that you can make it more.”

—Alan J. Perlis (April 1, 1922 – February 7, 1990)

6.5. 6.5 When the answer cannot be put into words, neither can the question be put into words. The riddle does not exist. If a question can be framed at all, it is also possible to answer it.

Again, because all is just an evolution of a machine memory.

Although, I think, Wittgenstein is, again, unaware of undecidability.

6.5.1. 6.51 Scepticism is not irrefutable, but obviously nonsensical, when it tries to raise doubts where no questions can be asked. For doubt can exist only where a question exists, a question only where an answer exists, and an answer only where something can be said.

Nonsensical here is not a negative characteristic.

Doubt means "being unsure your program is without errors".

Scepticism, I guess, whether you have digitised your input correctly. And this is nonsensical, because incorrectness of digitisation cannot be distinguished from incorrectness of logic.

6.5.2. 6.52 We feel that even when all possible scientific questions have been answered, the problems of life remain completely untouched. Of course there are then no questions left, and this itself is the answer.

So, he believes that it is possible to make a complete logical model of the world. (Which itself has been disproved.)

And the answer is that life has no "problems of life", those are meaningless.

But due to incompleteness of mathematics, there will always be problems to prove.

6.521 The solution of the problem of life is seen in the vanishing of the problem. (Is not this the reason why those who have found after a long period of doubt that the sense of life became clear to them have then been unable to say what constituted that sense?)

Why do we program? Because it is fun?

Anyway, it is not about logic and is unlikely to be answered in precise statements.
6.522 There are, indeed, things that cannot be put into words. They make themselves manifest. They are what is mystical.

Machine learning is (still) an example of such a thing. Clearly works, and is humiliatingly unclear about how it actually works.

If we explain them at some point, they will stop being mysterious.

6.5.3. 6.53 The correct method in philosophy would really be the following: to say nothing except what can be said, i.e. propositions of natural science — i.e. something that has nothing to do with philosophy — and then, whenever someone else wanted to say something metaphysical, to demonstrate to him that he had failed to give a meaning to certain signs in his propositions. Although it would not be satisfying to the other person — he would not have the feeling that we were teaching him philosophy — this method would be the only strictly correct one.

And it is the way to digitise the world.

The main point off this clause is not the denigration of non-natural sciences, but rather the idea that evern soft science, via the philosophical method, can be digitised sufficiently to be described in a programming language.

And then it becomes a natural science and can be treated accordingly.

6.5.4. 6.54 My propositions serve as elucidations in the following way: anyone who understands me eventually recognizes them as nonsensical, when he has used them — as steps — to climb up beyond them. (He must, so to speak, throw away the ladder after he has climbed up it.) He must transcend these propositions, and then he will see the world aright.

Again, nonsensical here is not a denigrating characterisation. Nonsensical means that there is no formal logic that describes the philosophical method of the Tractatus.

That could have been a logic of the machine that is simulating us.

However, remember, a machine may have cpuid, cpuinfo, and similar instructions. But the programmer knows which part of memory corresponds to them. For robots, they are indistinguishable from random places in memory. The Tractatus may be the cpuid.

7. 7 What we cannot speak about we must pass over in silence.

I wonder how many people have written a commentary as extensive as mine, and had it deleted after reading this statement.

For me this clause is perfectly clear: do not try to create "digital court jury", "digital moral code", "aesthetics assessment algorithms", and such. Those things are subjective and ill-defined.

It's not that you cannot "create some incomplete model" of those domains. It is that those programs will start to eventually annoy your users by being outrageously wrong and exploitable. Like those tiny grids that you can glue onto your car's license plate to make it completely incomprehensible to image recognition systems.

All those "StackOverflow" "Rating, Carma, Voting"; Habr's "Carma, Rating", Facebook's "Likes" and "Emotions", are going to become humiliatingly fake and representing nothing very soon after being implemented.

8. The End

Lockywolf <2021-06-01 Tue 18:22>

This review took 36 working hours.