<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Jellymatter</title>
	<atom:link href="http://jellymatter.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://jellymatter.com</link>
	<description>The blog that is not afraid of equations... or bees</description>
	<lastBuildDate>Wed, 19 Jun 2013 09:29:24 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='jellymatter.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/8e679876af1bc1410a95bf2305dfcae4?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>Jellymatter</title>
		<link>http://jellymatter.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://jellymatter.com/osd.xml" title="Jellymatter" />
	<atom:link rel='hub' href='http://jellymatter.com/?pushpress=hub'/>
		<item>
		<title>Friston&#8217;s Free Energy for Dummies</title>
		<link>http://jellymatter.com/2013/06/01/fristons-free-energy-for-dummies/</link>
		<comments>http://jellymatter.com/2013/06/01/fristons-free-energy-for-dummies/#comments</comments>
		<pubDate>Sat, 01 Jun 2013 04:49:55 +0000</pubDate>
		<dc:creator>Lucas Wilkins</dc:creator>
				<category><![CDATA[Actual Science]]></category>
		<category><![CDATA[Opinion]]></category>
		<category><![CDATA[Potentially Useful Stuff]]></category>
		<category><![CDATA[Reviews]]></category>
		<category><![CDATA[bayesian brain]]></category>
		<category><![CDATA[brains]]></category>
		<category><![CDATA[dynamical systems]]></category>
		<category><![CDATA[free energy]]></category>
		<category><![CDATA[Friston]]></category>
		<category><![CDATA[information]]></category>
		<category><![CDATA[information theory]]></category>
		<category><![CDATA[neuroscience]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[surprisal]]></category>

		<guid isPermaLink="false">http://jellymatter.com/?p=3851</guid>
		<description><![CDATA[People always want an explanation of Friston&#8217;s Free Energy that doesn&#8217;t have any maths. This is quite a challenge, but I hope I have managed to produce something comprehensible. This is basically a summary of Friston&#8217;s Entropy paper (available here). A friend of jellymatter was instrumental in its production, and for this reason I am [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>People always want an explanation of Friston&#8217;s Free Energy that doesn&#8217;t have any maths. This is quite a challenge, but I hope I have managed to produce something comprehensible.</p>
<p>This is basically a summary of Friston&#8217;s <a href="http://www.mdpi.com/journal/entropy">Entropy</a> paper (<a href="http://www.fil.ion.ucl.ac.uk/~karl/A%20Free%20Energy%20Principle%20for%20Biological%20Systems.pdf">available here</a>). A friend of jellymatter was instrumental in its production, and for this reason I am fairly confident that my summary is going in the right direction, even if I have not emphasised exactly the same things as Friston.</p>
<p>I&#8217;ve made a point of writing this without any maths, and I have highlighted what I consider to be the main assumptions of the paper and marked them with a <strong>P</strong>.</p>
<p><span id="more-3851"></span></p>
<p><strong>Optimisation Theories</strong></p>
<p>The basic structure of Friston&#8217;s theory is not particularly unusual: it is one of many theories that work by assuming that something is optimised, in this case a &#8220;free energy&#8221; (a bit like the physics one). Much of modern physics is based on some kind of optimisation: in mechanics one minimises an <a href="http://en.wikipedia.org/wiki/Lagrangian_mechanics">action</a>, and in thermodynamics one minimises the (thermodynamic) free energy. Further afield, optimisation is found in economic decision making, where a risk is minimised, and in population dynamics, where, depending on one&#8217;s interpretation, fitness is maximised.</p>
<p>In fact, all theories, right or wrong, can be formulated in terms of an optimisation. The theory &#8220;Summer is warmer than winter&#8221; could be expressed as &#8220;physics works so as to maximise summer temperature minus winter temperature&#8221;. Usually, though, we do not speak in this way.</p>
<p>Formulating a problem in this way always raises a question: &#8220;why should this quantity be optimised?&#8221; There are two distinct but non-exclusive responses: (1) some other theory suggests that it should be optimised; and (2) it is an elegant summary of other models.</p>
<p>Risk and fitness optimisation are of the first type: in economics, the notion of utility justifies* the minimisation of risk, and the notion of <a href="http://en.wikipedia.org/wiki/Gene-centered_view_of_evolution">replicators</a> is often used to motivate the maximisation of fitness. Contrariwise, the action principles of physics are of the second type: they unify various principles in one coherent and elegant framework.</p>
<p>The argument that Friston provides addresses both. In the first case, it is motivated by a notion that agents are &#8220;coherent&#8221; in some sense (active systems). In the second, it generalises a number of concepts in machine learning and statistical inference.</p>
<p>Here, I will not worry about its ability to generalise mathematical theorems, but attempt to restate the argument for it following from biological principles. Friston&#8217;s presentation is usually aimed at those who prefer overarching, general, mathematical theories, but this seems to me the source of many of the difficulties that people have when trying to understand it.</p>
<p>Importantly, it is the justification in terms of other biological theories that matters to non-theoreticians when they decide whether or not to attempt an understanding of the mathematical details.</p>
<p><strong>Agent and Environment</strong></p>
<p>The model begins with a sensory-motor feedback loop. The agent affects its environment and the environment affects the agent &#8211; these are modeled as physical systems. In the environment there is &#8220;added noise&#8221;, and this noise motivates us to talk about probabilities and information.</p>
<p style="text-align:left;padding-left:30px;"><strong>P1</strong>: <em>The internal states of an organism react deterministically to random (but correlated) sensory information from the environment.</em></p>
<p>Because of the noise implicit in the environment, the environment behaves randomly. And because of this the agent &#8211; whose state is affected by the environment &#8211; behaves randomly in a corresponding fashion. Which then affects the environment with (delayed and transformed) noise, which affects the agent&#8230; etc. etc. The consequence of all of this is that there is now a probability distribution over the possible physical states of the agent and the environment.</p>
<p style="text-align:left;padding-left:30px;"><strong>P2</strong>: <em>Organisms act against the environment&#8217;s randomness, towards being in a definite state.</em></p>
<p>The stated motivation for this is homeostasis: we act so as to not decay into disordered molecules, by eating, avoiding danger, not exploding, and what have you.</p>
<p>The most obvious (but not the only) way of measuring the amount of randomness is by talking about <em>entropy</em>, or in this case, the <em>surprisal</em> which for the purposes here we can consider to be the same thing as entropy. The surprisal measures the number of states that the system can be in. If it is low, the system stays in one of very few states; if it is high, the system is in one of many. So, the self-maintaining-ness/homeostaticity/distinctness is measured by the surprisal, which, according to Friston, should be minimised.</p>
<p>At first glance it may seem that one should apply this measure to the organism&#8217;s internal states, but it turns out that this doesn&#8217;t work. For example, a rock would have very definite states and a low surprisal. Instead, the proposed solution is for the organism to minimise the surprisal associated with the <em>external</em> world* &#8211; Friston calls systems that do this <em>active systems</em>.</p>
<p>An active system acts so as to make its sensory inputs as predictable and unsurprising as possible. This means we can make a modification to <strong>P2</strong>:</p>
<p style="text-align:left;padding-left:30px;"><strong>P2b</strong>: <em>Organisms act against the environment&#8217;s randomness, towards obtaining sensory evidence that suggests that they are in a well-defined, definite state.</em></p>
<p>Doing this solves the rock problem. And because the randomness of the inputs affects the internal state, it is measuring a very similar thing.</p>
<p>Depending on one&#8217;s philosophy <strong>P2b</strong> may be either a refinement or a change to <strong>P2</strong>. One&#8217;s opinion about this is crucial for deciding if the notion of homeostasis is a justification for this theory. Either way, with <strong>P2b</strong> we still have a problem much like the rock example from before. Sitting in a dark room with your fingers in your ears would be an excellent way of minimising surprisal &#8211; and we obviously don&#8217;t do that. Much.</p>
<p>The optimality of sensory deprivation can be seen as a motivational problem for free energy, but first I must go through some stuff about inference.</p>
<p><strong>Best Inference</strong></p>
<p style="padding-left:30px;"><strong>P3</strong>:<em> Organisms make good inferences.</em></p>
<p>Organisms can be considered to make inferences, and making good inferences has different requirements to the notion of surprise as described above.</p>
<p>To discuss inference we must introduce some more probability distributions. Let&#8217;s say that the probabilities of sensory inputs are determined by things that we can&#8217;t directly observe. If we observe a coin landing on heads five times and tails five times, then we can make <em>inferences</em> about some hidden parameter that has a value somewhere around one half &#8211; a statistician would say the outcome of a fair coin flip is drawn from a <a href="http://en.wikipedia.org/wiki/Bernoulli_distribution">Bernoulli distribution</a> with a parameter of 1/2. The more we observe the coin landing, the more sure we can be about the parameter.</p>
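<p>For those who would rather see this in code than in words, here is a minimal Python sketch of the coin example. It assumes a uniform prior over the hidden parameter, which makes the belief after seeing the flips a Beta distribution; that choice, and the function name <code>beta_posterior</code>, are illustrative assumptions rather than anything taken from Friston&#8217;s paper.</p>
<pre>from fractions import Fraction

def beta_posterior(heads, tails):
    """Belief about the coin's hidden parameter, assuming a uniform prior.

    Returns the (mean, variance) of a Beta(heads + 1, tails + 1) distribution.
    """
    a = heads + 1
    b = tails + 1
    mean = Fraction(a, a + b)
    variance = Fraction(a * b, (a + b) ** 2 * (a + b + 1))
    return mean, variance

# Same proportion of heads each time, but more and more flips.
for heads, tails in [(5, 5), (50, 50), (500, 500)]:
    mean, variance = beta_posterior(heads, tails)
    print(heads + tails, "flips: mean", float(mean), "spread", float(variance) ** 0.5)</pre>
<p>The mean stays at one half, but the spread shrinks as the number of flips grows, which is the sense in which more observations make us more sure about the parameter.</p>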
<p>Of course in this example the choice of parameter has no obvious physical basis: one could easily choose another, related parameter &#8211; say by adding one, squaring it, taking the logarithm etc &#8211; and have it describe the same coin but in a different way. The choice of parameter is kind of arbitrary; and it is for this reason that Friston describes them as <em>fictive</em>. We can choose them how we like as long as they describe the same thing.</p>
<p>The fictive parameters are used to model the world. Consider a brick. This brick could have parameters width, height and depth. Associated with the parameters there would be a confidence in each parameter: there is a probability that the brick is between 9 and 10 cm long, 8 and 9 cm, 7 and 8 cm, as well as for any other pair of numbers, 8.51234&#8230; and 8.51333&#8230; or whatever. But we needn&#8217;t have chosen length, width and height &#8211; we could equally have chosen surface area, volume and perimeter. We could still describe the same brick, and with these there would be a corresponding probability distribution which you could work out from the former.</p>
<p>It is both the power and the failure of information theory that it talks about probabilities with complete indifference for what they are.</p>
<p>Because the parameters don&#8217;t necessarily have any specific meaning or interpretation, at this point we simply forget about trying to work out what they are or what they mean &#8211; all we care about is <em>the thing they describe</em>. Friston argues, that whatever they happen to be we can still talk about them abstractly. The mathematical tools he uses are then chosen so as to make this aspect of his probabilities unproblematic (measure invariance).</p>
<p>The probability of the fictive parameters comes in two main flavors. One, the probability of the parameters as determined by the state of the sensory system (an &#8220;objective&#8221; probability in some sense), related to Friston&#8217;s <em>&#8220;generative model&#8221;</em>. The other, the probability of the parameters as determined by an internal model (the subject&#8217;s probability), which Friston calls a <em>&#8220;proposal density&#8221;</em>. The former is the world as it is best described; the latter is the product of an organism&#8217;s attempt to describe it.</p>
<p>The main idea of making inferences is that the organism tries to find the probability distribution (proposal density) that best matches the &#8220;real&#8221; probability distribution (generative model). The better the internal model is, the better it matches the world. The better the model matches the world, the better it is for making predictions about it.</p>
<p>Choosing to minimise the difference between probability distributions entails lots of things that people want from inferential systems, such as the <a href="http://en.wikipedia.org/wiki/Principle_of_Maximum_Entropy">maximum entropy principle</a>**. To measure this difference, Friston uses a standard tool: the <a href="http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence">Kullback-Leibler divergence</a>. There are plenty of other measures that he could have used, but this one is usually preferred by information theory and Bayesian types.</p>
<p><strong>Conflict Between Surprise and Inference</strong></p>
<p>The next step is to acknowledge that both inferential ability and surprisal minimisation are &#8220;good&#8221;, and define a quantity that measures the goodness. Depending on how you go about making this quantity, you end up with different things. But the quantity that is nice to work with once one has settled on the surprise and the Kullback-Leibler divergence is Friston&#8217;s <em>free energy</em>. This basically just adds the two together. The result is certainly elegant, but there is no motivation for this particular form beyond mathematical tractability (the reason for the mathematical niceness is the subject of information geometry).</p>
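<p>The adding-together can be sketched in a few lines of Python. Everything here is a made-up toy: a hidden cause that is either a cat or a dog, a single sensory input (a meow), and invented probabilities. The sketch also assumes that the &#8220;real&#8221; distribution to be matched is the Bayesian posterior over the hidden cause, which is one common reading rather than the only one.</p>
<pre>import math

# Hypothetical toy world; all numbers are invented for illustration.
prior = {"cat": 0.5, "dog": 0.5}          # p(cause)
likelihood = {"cat": 0.9, "dog": 0.1}     # p(meow | cause)

# Probability of the sensory input, and the surprisal that goes with it.
evidence = sum(prior[c] * likelihood[c] for c in prior)
surprisal = -math.log(evidence)

# The best possible inference about the cause, given the input.
posterior = {c: prior[c] * likelihood[c] / evidence for c in prior}

def kl_divergence(q, p):
    """Kullback-Leibler divergence of q from p, for dicts over the same keys."""
    return sum(q[c] * math.log(q[c] / p[c]) for c in q if q[c] != 0)

def free_energy(q):
    """Surprisal plus the inferential error of the proposal density q."""
    return surprisal + kl_divergence(q, posterior)

print(free_energy(posterior) - surprisal)  # proposal density matches the posterior
print(free_energy(prior) - surprisal)      # proposal density ignores the meow</pre>
<p>With a proposal density that matches the posterior exactly, the inferential error vanishes and the free energy equals the surprisal; any other proposal density gives a larger value.</p>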
<p>Importantly, when optimal inference and minimising surprisal are independent of each other, minimising the free energy minimises both the surprisal and the inferential error. This is an <a href="http://en.wikipedia.org/wiki/Ceteris_paribus">&#8220;all things being equal&#8221;</a> justification:</p>
<p style="padding-left:30px;"><strong>P4</strong>: <em>If it were possible, an organism would minimise both the surprisal and the inferential &#8220;error&#8221; of its predictions independently of each other.</em></p>
<p>But no things are ever equal. In practice, the two quantities are not independent. This is because the &#8220;subject&#8221; probability must have some physical basis in the internal states of the organism, and is thereby constrained by this physicality. This is essentially the idea that the brain represents probabilities, and is what Friston calls <em>entailment</em>. The internal states are the same thing as the subject&#8217;s probability distribution but viewed, as it were, through a different lens. The exact manner in which the brain states are mapped to probabilities is not discussed directly, but there is an implicit notion that the brain cannot represent just any old probability distribution.</p>
<p>The usual consequence of <em>entailment</em> is that it is no longer possible to simultaneously minimise surprise and maximise inference, and instead, there is a trade-off between the two:</p>
<p style="padding-left:30px;"><strong>P5</strong>: <em>Due to the physical nature of organisms, maximising inferential ability and minimising surprise are in conflict with each other.</em></p>
<p>This motivates the need for free energy as one singular quantity, rather than two separate ones. It is also how one solves the &#8220;dark room with fingers in ears&#8221; problem, though for a slightly technical reason: implicit in the formalisation of maximised inferential ability is the notion that making the best inferences about lots of things is better than making the best inferences about fewer things. Whilst in the state of sensory deprivation I mentioned, one can make rather good inferences about what one sees, but one cannot make the inferences about other things that one could after opening one&#8217;s eyes and taking one&#8217;s fingers out of one&#8217;s ears***.</p>
<p>The relationship between the internal states and the internal/subject probabilities is of fundamental importance. It is the very heart of what Friston calls the free energy <em>principle</em>. In this he elaborates on the nature of the physical constraint on the probability distribution that the internal states encode. Basically, the constraint is simply the number of states that the brain can be in. So, we have a motivation for <strong>P5</strong>:</p>
<p style="padding-left:30px;"><strong>P6</strong>: <em>The world is very big, the brain is relatively small. The brain just does not have the capacity to match the complexity of the real world as provided in sensation.</em></p>
<p>Between them <strong>P1</strong>-<strong>P6</strong> are the main motivations for the free energy in what I called type (1) terms &#8211; those that do not appeal to the generality of the theory. I hope they provide a good outline of the reasons for using free energy.</p>
<p>I will skip the applications sections of the paper as they are of the other kind. As I said, whilst these are important for those theoreticians who already use those techniques, it is not my concern here.</p>
<p><strong>Summary</strong></p>
<p>So, to summarise the notion of free energy: it is one way that one may quantify the trade-off between making one&#8217;s environment predictable, and the ability to make predictions where you cannot. The quantity is chosen so as to fit easily with established formalisms in information theory and Bayesian probability.</p>
<p>Throughout this summary I have omitted the mathematical assumptions such as additivity and the relevance of the surprisal and KL-divergence as I do not think including these helps with readability. But we must not forget that they are fundamental to how the particular form of the free energy is formulated. The wordy argument I have presented applies equally to all other free-energy like formulations.</p>
<p><strong>Footnotes</strong></p>
<p>* However, the internal and external surprisals are related because the state of the environment determines the internal state of the agent.</p>
<p>** Interestingly, the same version of the maximum entropy principle is also a property of Friston&#8217;s free energy. One should probably view Friston&#8217;s <em>corollary 3</em> as showing that using free energy does not break this property of the KL divergence. Also, there are some very interesting results in information geometry along exactly these lines.</p>
<p>*** This raises obvious questions about what fictive variables are valid, but I shall skip over this problem mentioning only that it is a problem that may potentially be solved empirically.</p>
]]></content:encoded>
			<wfw:commentRss>http://jellymatter.com/2013/06/01/fristons-free-energy-for-dummies/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d2d876b3bbbb995d9d47e32a639a6533?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jellymatter</media:title>
		</media:content>
	</item>
		<item>
		<title>Why (: is upside down</title>
		<link>http://jellymatter.com/2013/04/26/why-is-upside-down/</link>
		<comments>http://jellymatter.com/2013/04/26/why-is-upside-down/#comments</comments>
		<pubDate>Fri, 26 Apr 2013 20:00:00 +0000</pubDate>
		<dc:creator>Lucas Wilkins</dc:creator>
				<category><![CDATA[Actual Science]]></category>
		<category><![CDATA[Opinion]]></category>
		<category><![CDATA[:)]]></category>
		<category><![CDATA[eye tracking]]></category>
		<category><![CDATA[gaze]]></category>
		<category><![CDATA[orientation]]></category>
		<category><![CDATA[smileys]]></category>

		<guid isPermaLink="false">http://jellymatter.com/?p=3837</guid>
		<description><![CDATA[I think it is fairly intuitive that the smiley in the title is upside down. But why is this? Generally, when we look at a face we look at the eyes first. These days it is pretty easy to track where people look, the equipment is cheap and easily available &#8211; one simply uses a [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>I think it is fairly intuitive that the smiley in the title is upside down. But why is this?</p>
<p><img src="http://jellymatter.files.wordpress.com/2013/04/green_smiley.jpg?w=320&#038;h=292" alt="green_smiley" width="320" height="292" class="aligncenter size-full wp-image-3845" border="0" /></p>
<p>Generally, when we look at a face we look at the eyes first. These days it is pretty easy to track where people look: the equipment is cheap and easily available &#8211; one simply uses a camera to look at the pupil and then calculates where the subject is looking. A preference for beginning with the eyes is a widely observed phenomenon (<a href="http://link.springer.com/content/pdf/10.3758%2FBF03196306.pdf">&#8216;eyes are special&#8217;</a>).</p>
<p>English readers, like readers in most languages, scan left to right when reading. With reading we constantly train ourselves to prefer moving left to right, something which leads to a phenomenon often called a reader&#8217;s bias (<a href="http://eprints.lincoln.ac.uk/2423/1/LGB-revision1109.pdf">see this</a>). It is not only during reading that the direction from left to right is preferred.</p>
<p>So, it is not surprising that we should think that smileys with eyes on the left are correct: left to right is preferred for reading, and eyes to mouth is preferred for viewing faces.</p>
]]></content:encoded>
			<wfw:commentRss>http://jellymatter.com/2013/04/26/why-is-upside-down/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d2d876b3bbbb995d9d47e32a639a6533?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jellymatter</media:title>
		</media:content>

		<media:content url="http://jellymatter.files.wordpress.com/2013/04/green_smiley.jpg" medium="image">
			<media:title type="html">green_smiley</media:title>
		</media:content>
	</item>
		<item>
		<title>Cybernetic music</title>
		<link>http://jellymatter.com/2012/11/29/cybernetic-music/</link>
		<comments>http://jellymatter.com/2012/11/29/cybernetic-music/#comments</comments>
		<pubDate>Thu, 29 Nov 2012 04:12:47 +0000</pubDate>
		<dc:creator>Nathaniel Virgo</dc:creator>
				<category><![CDATA[Random Stuff]]></category>
		<category><![CDATA[cybernetics]]></category>
		<category><![CDATA[feedback]]></category>
		<category><![CDATA[music]]></category>
		<category><![CDATA[oscillators]]></category>
		<category><![CDATA[supercollider]]></category>

		<guid isPermaLink="false">http://jellymatter.com/?p=3581</guid>
		<description><![CDATA[Is it okay to post art on a science blog? Well this is kind of science, so I guess it&#8217;s kind of okay. Here is a little piece of computer-generated music that I created yesterday: As Twitter user @DanieleTatti noted, it sounds like a sort of Scottish raga. But what I wanted to post about [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>Is it okay to post art on a science blog? Well this is kind of science, so I guess it&#8217;s kind of okay.</p>
<p>Here is a little piece of computer-generated music that I created yesterday:</p>
<object width="100%" height="81"><param name="movie" value="http://player.soundcloud.com/player.swf?url=http%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F69148018&show_comments=true&auto_play=false&color=00ff15"></param><param name="allowscriptaccess" value="always"></param><embed width="100%" height="81" src="http://player.soundcloud.com/player.swf?url=http%3A%2F%2Fapi.soundcloud.com%2Ftracks%2F69148018&show_comments=true&auto_play=false&color=00ff15" allowscriptaccess="always" type="application/x-shockwave-flash"></embed></object>
<p>As Twitter user @DanieleTatti noted, it sounds like a sort of Scottish raga. But what I wanted to post about was the algorithm used to generate that ever changing sequence of pitches and warbles. It&#8217;s quite a simple idea &#8211; simple enough in fact that the whole piece is generated by the following 140 characters of <a href="http://supercollider.sourceforge.net/">SuperCollider</a> code:</p>
<pre>play{ar(r=RLPF,Saw.ar([200,302]).mean,5**(n={LFNoise1.kr(1/8)})*1e3,0.6)
+r.ar(Saw ar:Amplitude.kr(3**n*3e3*InFeedback.ar(0)+1,4,4),1e3)/5!2}</pre>
<p><span id="more-3581"></span></p>
<p>Lately, I&#8217;ve been reading about cybernetics (specifically, I&#8217;m reading the really rather excellent book <a href="http://www.amazon.co.uk/The-Cybernetic-Brain-Sketches-Another/dp/0226667901/">The Cybernetic Brain</a> by Andrew Pickering), and it&#8217;s got me thinking about homeostasis and feedback-based control systems. This code defines a drone consisting of two filtered sawtooth oscillators at fixed frequencies, as well as another filtered sawtooth oscillator whose frequency can vary. (I&#8217;ll call this one &#8220;the oscillator.&#8221;) The oscillator&#8217;s frequency is chosen by listening to the output sound (that is, the drone combined with the oscillator&#8217;s output), and choosing its frequency based on the overall amplitude of this signal. The amplitude-following mechanism (which is built into SuperCollider) has a bit of a time lag in it, so what happens is that when the drone and the oscillator are out of phase, the oscillator&#8217;s frequency will decrease over time, whereas when they&#8217;re in phase (and hence louder), the frequency increases.</p>
<p>For reasons I have some intuitive understanding of but would have a hard time showing mathematically, this tends to result in the system falling into stable states where the frequency of the drone and the oscillator are in a nice integer ratio. The phenomenon is presumably closely related to <a href="http://en.wikipedia.org/wiki/Entrainment_(physics)">oscillator entrainment</a>, which I&#8217;ve also experimented with musically, but which doesn&#8217;t tend to sound as good for some reason.</p>
<p>The human perception of sound is such that we find integer ratios to sound consonant, particularly when the integers involved are small. For example, a frequency ratio of 3/2 is a perfect fifth, and Pythagoras&#8217; discovery of this was in many ways the genesis of physics as a numerical science.</p>
<p>There are several sources of variation in this code, which prevent the oscillator from remaining at the same frequency all the time. The output of the amplitude tracker is multiplied by a slowly (pseudo-randomly) changing value before being sent to the oscillator&#8217;s frequency; but also the drone has some variation in it, and the system is sensitive to this as well. The drone has a slowly changing filter, and it consists of two tones at frequencies 200 and 302 Hz, which is slightly wider than a perfect fifth, so there&#8217;s a <a href="http://en.wikipedia.org/wiki/Beat_(acoustics)">beat frequency</a> of 2 Hz. Sometimes you can hear the oscillator doing a slow vibrato as it responds to these beat frequencies. It&#8217;s quite satisfying to listen to this piece knowing that the oscillator is &#8220;listening&#8221; to and responding to the sound of its output. This is very rarely the case with computer-generated music.</p>
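<p>The arithmetic behind those numbers can be checked with a short Python sketch. The two drone frequencies come from the SuperCollider code; the variable names are only illustrative, and the conversion to cents is an extra added here for scale.</p>
<pre>import math

drone_low = 200.0    # Hz, from the SuperCollider code
drone_high = 302.0   # Hz, from the SuperCollider code

pure_fifth = drone_low * 3 / 2        # a frequency ratio of exactly 3/2
detune_hz = drone_high - pure_fifth   # how far the upper tone sits above it

# The same mistuning in cents (hundredths of an equal-tempered semitone).
detune_cents = 1200 * math.log2(drone_high / pure_fifth)

print(pure_fifth, detune_hz, round(detune_cents, 1))</pre>
<p>The upper tone sits 2 Hz above a pure fifth over 200 Hz, a mistuning of a little over a tenth of a semitone.</p>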
<p>You can hear the oscillator&#8217;s frequency dropping to zero at the end. This is a response to me turning the volume down, but if I hadn&#8217;t done that the piece would have gone on forever.</p>
<p>It was quite a lot of work to fit this idea into 140 characters. I originally had a 200-ish character one that was a lot more complicated, but in the end I think I prefer this one.</p>
<p>As you might have guessed, creating 140-character pieces of music is a hobby of mine. I post them at unpredictable intervals on Twitter as @<a href="https://twitter.com/headcube">headcube</a>. I&#8217;ve only recently started posting audio clips of them, so if you want to listen to the full back catalogue (or to hear more than two minutes of this piece), you&#8217;ll have to install SuperCollider. It&#8217;s available for every platform, and if you&#8217;re on Linux you should be able to get it via your package manager.</p>
]]></content:encoded>
			<wfw:commentRss>http://jellymatter.com/2012/11/29/cybernetic-music/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/3d3d01f62eb81589b829f28c8c9f6cd0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Nathaniel</media:title>
		</media:content>
	</item>
		<item>
		<title>&#8230; and so does Pink</title>
		<link>http://jellymatter.com/2012/10/17/and-so-does-pink/</link>
		<comments>http://jellymatter.com/2012/10/17/and-so-does-pink/#comments</comments>
		<pubDate>Wed, 17 Oct 2012 00:35:58 +0000</pubDate>
		<dc:creator>Lucas Wilkins</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://jellymatter.com/?p=3505</guid>
		<description><![CDATA[PURPLE EXISTS! Just look at it.]]></description>
				<content:encoded><![CDATA[<h1 style="text-align:center;"><strong><span style="color:#800080;">PURPLE EXISTS!</span></strong></h1>
<p style="text-align:center;">Just look at it.</p>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/jellymatter.wordpress.com/3505/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/jellymatter.wordpress.com/3505/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=jellymatter.com&#038;blog=19982511&#038;post=3505&#038;subd=jellymatter&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://jellymatter.com/2012/10/17/and-so-does-pink/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/d2d876b3bbbb995d9d47e32a639a6533?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jellymatter</media:title>
		</media:content>
	</item>
		<item>
		<title>Paradoxes of probability theory: the two envelopes</title>
		<link>http://jellymatter.com/2012/09/25/paradoxes-of-probability-theory-the-two-envelopes/</link>
		<comments>http://jellymatter.com/2012/09/25/paradoxes-of-probability-theory-the-two-envelopes/#comments</comments>
		<pubDate>Tue, 25 Sep 2012 15:34:44 +0000</pubDate>
		<dc:creator>Nathaniel Virgo</dc:creator>
				<category><![CDATA[Opinion]]></category>
		<category><![CDATA[Paradoxes of probability theory]]></category>
		<category><![CDATA[bayesian]]></category>
		<category><![CDATA[paradoxes]]></category>
		<category><![CDATA[probability]]></category>

		<guid isPermaLink="false">http://jellymatter.com/?p=3448</guid>
		<description><![CDATA[This post is about a classic probability puzzle. It goes something like this: I place two envelopes on the table in front of you. One of them contains a Prize, which is an amount of money in pounds, but you don&#8217;t know how much it is. The other one contains a Special Bonus Prize, which [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=jellymatter.com&#038;blog=19982511&#038;post=3448&#038;subd=jellymatter&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>This post is about a classic probability puzzle. It goes something like this: I place two envelopes on the table in front of you. One of them contains a Prize, which is an amount of money in pounds, but you don&#8217;t know how much it is. The other one contains a Special Bonus Prize, which is worth exactly twice as much money as the Prize. It&#8217;s your lucky day &#8212; but you can only choose one envelope. Which do you choose?</p>
<p>&#8220;Well,&#8221; you say to yourself, &#8220;it doesn&#8217;t matter, they&#8217;re both the same,&#8221; so you pick one at random. Let&#8217;s say it&#8217;s the one on the left. But now I ask you if you want to change your mind.</p>
<p>&#8220;Well,&#8221; you might say to yourself, &#8220;let <em>x </em>be the amount of money in the envelope I&#8217;m holding. This envelope has a 50% chance of being the Prize, in which case the other envelope contains 2<em>x</em>. On the other hand, there&#8217;s a 50% chance that this is the Special Bonus Prize, in which case the other envelope contains 0.5<em>x</em>. But still, the expected value of the other envelope is 0.5*2<em>x</em> + 0.5*0.5<em>x</em> = 1.25<em>x</em>. So on the balance of probabilities I should definitely switch.&#8221; But then I offer to let you switch again, and again, and again, and every time you go through the same reasoning, never managing to settle on a particular envelope because each one seems like it should contain more money than the other.  Clearly something is wrong with this reasoning, but what is it?</p>
<p>In this post, I&#8217;ll solve this problem in what I consider to be the proper Bayesian way, pinpointing exactly where the problem is.  You might want to think about the question for a bit and come up with your own idea of its solution before reading on.</p>
<p><span id="more-3448"></span></p>
<p>One thing to note before we start is that the problem goes away if I tell you how much money the Prize is worth. For example, if the Prize is £10 and the Bonus Prize is £20, the reasoning goes like this: If the envelope in my hand is the Prize then the other envelope is worth £20, otherwise the other envelope is worth £10, so its expectation is £15. That&#8217;s the same as the expected value of the envelope in your hand, so there&#8217;s no problem. So whatever the difficulty is, it has something to do with the idea that the value of the Prize is unknown.</p>
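<p>As a quick check of the known-Prize case, here is a minimal Python sketch. The function name <code>expected_values</code> is made up for this illustration, and the only assumption is the 50/50 chance of holding either envelope.</p>

```python
from fractions import Fraction


def expected_values(prize):
    """Expected contents of the held envelope and the other one, for a known Prize."""
    bonus = 2 * prize
    half = Fraction(1, 2)
    # Two equally likely cases: I hold the Prize, or I hold the Bonus.
    held = half * prize + half * bonus
    other = half * bonus + half * prize
    return held, other


held, other = expected_values(10)
print(held, other)  # 15 15
```

<p>Both expectations come out as £15, so with a known Prize there is nothing to gain by switching.</p>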
<p>How can we model this problem using probability theory? I decided to use three jointly distributed random variables: <em>A</em>, which represents the value of the envelope on the left (call it envelope A); <em>B</em>, which represents the amount of money in the other envelope; and <em>E</em>, which can take on two values, <em>a</em> or <em>b</em>. The variable <em>E</em> represents whether envelope A or envelope B contains the Special Bonus Prize.</p>
<p>We have to come up with a prior distribution for these three variables. In order to make them represent the things they&#8217;re supposed to represent, this distribution must have the following properties:</p>
<ul>
<li>The marginal distributions must be the same for the amount of money in the two envelopes. That is, <img src='http://s0.wp.com/latex.php?latex=p%28A%3Dx%29+%3D+p%28B%3Dx%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(A=x) = p(B=x)' title='p(A=x) = p(B=x)' class='latex' />, for every <em>x</em>.</li>
<li>Each envelope should have a 50/50 chance of containing the Bonus Prize: <img src='http://s0.wp.com/latex.php?latex=p%28E%3Da%29+%3D+p%28E%3Db%29+%3D+%5Cfrac%7B1%7D%7B2%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(E=a) = p(E=b) = &#92;frac{1}{2}' title='p(E=a) = p(E=b) = &#92;frac{1}{2}' class='latex' />.</li>
<li>The Bonus envelope should contain twice as much money as the Prize envelope: <img src='http://s0.wp.com/latex.php?latex=p%28A%3D2B%7CE%3Da%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(A=2B|E=a)=1' title='p(A=2B|E=a)=1' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=p%28B%3D2A%7CE%3Db%29%3D1&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(B=2A|E=b)=1' title='p(B=2A|E=b)=1' class='latex' />.</li>
<li>The joint distribution shouldn&#8217;t change if you swap the two envelopes: <img src='http://s0.wp.com/latex.php?latex=p%28A%3Dx%2C+B%3Dy%29+%3D+p%28B%3Dx%2C+A%3Dy%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(A=x, B=y) = p(B=x, A=y)' title='p(A=x, B=y) = p(B=x, A=y)' class='latex' />.</li>
</ul>
<p>These aren&#8217;t necessarily independent, since the last one implies the first one and possibly the first three imply the last one (I&#8217;m not sure), but they&#8217;ll all be used below.</p>
<p>Now, with this notation, let&#8217;s go through the paradoxical argument again. I&#8217;ve marked where the problem is.</p>
<ol>
<li>Envelope A has a 50% probability of being the Special Bonus envelope. <img src='http://s0.wp.com/latex.php?latex=p%28E%3Da%29%3D%5Cfrac%7B1%7D%7B2%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(E=a)=&#92;frac{1}{2}' title='p(E=a)=&#92;frac{1}{2}' class='latex' />, or <img src='http://s0.wp.com/latex.php?latex=p%28B%3D2A%29%3Dp%28B%3D%5Cfrac%7B1%7D%7B2%7DA%29+%3D+%5Cfrac%7B1%7D%7B2%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(B=2A)=p(B=&#92;frac{1}{2}A) = &#92;frac{1}{2}' title='p(B=2A)=p(B=&#92;frac{1}{2}A) = &#92;frac{1}{2}' class='latex' />.</li>
<li>(The incorrect step.) Let the contents of A be <em>x</em>. Then, from (1), envelope B contains 2<em>x</em> with probability 0.5, and 0.5<em>x </em>with probability 0.5. That is, <img src='http://s0.wp.com/latex.php?latex=p%28B%3D2x+%7C+A%3Dx%29+%3D+%5Cfrac%7B1%7D%7B2%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(B=2x | A=x) = &#92;frac{1}{2}' title='p(B=2x | A=x) = &#92;frac{1}{2}' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=p%28B%3D%5Cfrac%7B1%7D%7B2%7Dx+%7C+A%3Dx%29+%3D+%5Cfrac%7B1%7D%7B2%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(B=&#92;frac{1}{2}x | A=x) = &#92;frac{1}{2}' title='p(B=&#92;frac{1}{2}x | A=x) = &#92;frac{1}{2}' class='latex' />, for any given <em>x</em>.</li>
<li>Therefore the expected value of B is equal to <img src='http://s0.wp.com/latex.php?latex=%5Cfrac%7B1%7D%7B2%7D%282x%2B%5Cfrac%7B1%7D%7B2%7Dx%29%3D%5Cfrac%7B5%7D%7B4%7Dx&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;frac{1}{2}(2x+&#92;frac{1}{2}x)=&#92;frac{5}{4}x' title='&#92;frac{1}{2}(2x+&#92;frac{1}{2}x)=&#92;frac{5}{4}x' class='latex' />, where <em>x</em> is the value of envelope A.</li>
<li>Therefore the expected value of B is greater than the expected value of A, and I should switch.</li>
<li>By symmetry I should also switch if I choose B, so by induction I should keep switching forever and become infinitely rich.</li>
</ol>
<p>Step 2 is the problem. It&#8217;s true that <img src='http://s0.wp.com/latex.php?latex=p%28B%3D2A%29%3Dp%28B%3D%5Cfrac%7B1%7D%7B2%7DA%29+%3D+%5Cfrac%7B1%7D%7B2%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(B=2A)=p(B=&#92;frac{1}{2}A) = &#92;frac{1}{2}' title='p(B=2A)=p(B=&#92;frac{1}{2}A) = &#92;frac{1}{2}' class='latex' />, but step 2 (and hence the whole argument) hinges on  claiming that this is still the case when conditioned on the actual value of A. That is,</p>
<p><img src='http://s0.wp.com/latex.php?latex=p%28B%3D2x+%7C+A%3Dx%29+%3D+p%28B%3D%5Cfrac%7B1%7D%7B2%7Dx+%7C+A%3Dx%29+%3D+%5Cfrac%7B1%7D%7B2%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(B=2x | A=x) = p(B=&#92;frac{1}{2}x | A=x) = &#92;frac{1}{2}' title='p(B=2x | A=x) = p(B=&#92;frac{1}{2}x | A=x) = &#92;frac{1}{2}' class='latex' /></p>
<p>for every <em>x</em>. But in fact <img src='http://s0.wp.com/latex.php?latex=p%28+B%3D2x+%7C+A%3Dx+%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p( B=2x | A=x )' title='p( B=2x | A=x )' class='latex' /> is given by</p>
<p><img src='http://s0.wp.com/latex.php?latex=p%28+A%3Dx+%7C+E%3Db+%29%5Cfrac%7Bp%28+E%3Db+%29%7D%7Bp%28+A%3Dx+%29%7D+%3D+%5Cfrac%7Bp%28+A%3Dx+%7C+E%3Db+%29%7D%7B2p%28+A%3Dx+%29%7D%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p( A=x | E=b )&#92;frac{p( E=b )}{p( A=x )} = &#92;frac{p( A=x | E=b )}{2p( A=x )},' title='p( A=x | E=b )&#92;frac{p( E=b )}{p( A=x )} = &#92;frac{p( A=x | E=b )}{2p( A=x )},' class='latex' /></p>
<p>(because B contains twice as much as A exactly when B is the Bonus envelope), whereas <img src='http://s0.wp.com/latex.php?latex=p%28+B%3D%5Cfrac%7B1%7D%7B2%7Dx+%7C+A%3Dx+%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p( B=&#92;frac{1}{2}x | A=x )' title='p( B=&#92;frac{1}{2}x | A=x )' class='latex' /> is given by</p>
<p><img src='http://s0.wp.com/latex.php?latex=p%28+A%3Dx+%7C+E%3Da+%29%5Cfrac%7Bp%28+E%3Da+%29%7D%7Bp%28+A%3Dx+%29%7D+%3D+%5Cfrac%7Bp%28+A%3Dx+%7C+E%3Da+%29%7D%7B2p%28+A%3Dx+%29%7D.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p( A=x | E=a )&#92;frac{p( E=a )}{p( A=x )} = &#92;frac{p( A=x | E=a )}{2p( A=x )}.' title='p( A=x | E=a )&#92;frac{p( E=a )}{p( A=x )} = &#92;frac{p( A=x | E=a )}{2p( A=x )}.' class='latex' /></p>
<p>Thus <img src='http://s0.wp.com/latex.php?latex=p%28+B%3D2x+%7C+A%3Dx+%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p( B=2x | A=x )' title='p( B=2x | A=x )' class='latex' /> can only be equal to <img src='http://s0.wp.com/latex.php?latex=p%28+B%3D%5Cfrac%7B1%7D%7B2%7Dx+%7C+A%3Dx+%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p( B=&#92;frac{1}{2}x | A=x )' title='p( B=&#92;frac{1}{2}x | A=x )' class='latex' /> if <img src='http://s0.wp.com/latex.php?latex=p%28+A%3Dx+%7C+E%3Da+%29+%3D+p%28+A%3Dx+%7C+E%3Db+%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p( A=x | E=a ) = p( A=x | E=b )' title='p( A=x | E=a ) = p( A=x | E=b )' class='latex' />, for every <em>x</em>. That is, step 2 only works if <em>A</em> is conditionally independent of <em>E</em>.</p>
<p>In other words, it only works if knowing whether A is the bonus prize tells you nothing about how much money is in it. This should raise a red flag already &#8212; if I told you that the envelope you&#8217;d selected was, in fact, the bonus prize, it would be quite strange if you didn&#8217;t then expect it to contain more money.</p>
<p>But an argument one sometimes hears in regards to this thought experiment is that if you know nothing &#8212; literally nothing &#8212; about the value of A, then this will in fact be true. Let&#8217;s try to codify this idea mathematically and see where it leads us.</p>
<p>Now, it&#8217;s quite difficult to say, a priori, what it really means to &#8220;know nothing&#8221; about something in probability theory. There&#8217;s a whole theory of so-called ignorance priors in Bayesian probability theory, but they&#8217;re quite fiddly and subtle things, so I&#8217;m not going to start out by trying to construct one. Instead I&#8217;ll just accept the claim in the previous paragraph (that knowing nothing means <em>A</em> is conditionally independent of <em>E</em>) and see where it leads.</p>
<p>Now, with this assumption of conditional independence, we have that</p>
<p><img src='http://s0.wp.com/latex.php?latex=p%28A%3Dx%29+%3D+%5Cfrac%7B1%7D%7B2%7D%28p%28A%3Dx+%7C+E%3Da%29+%2B+p%28A%3Dx+%7C+E%3Db%29%29+%3D+p%28+A%3Dx%7CE%3Da+%29+%3D+p%28A%3Dx+%7C+E%3Db%29.&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(A=x) = &#92;frac{1}{2}(p(A=x | E=a) + p(A=x | E=b)) = p( A=x|E=a ) = p(A=x | E=b).' title='p(A=x) = &#92;frac{1}{2}(p(A=x | E=a) + p(A=x | E=b)) = p( A=x|E=a ) = p(A=x | E=b).' class='latex' />             (i)</p>
<p>But <img src='http://s0.wp.com/latex.php?latex=p%28+A%3Dx+%7C+E%3Da+%29+%3D+p%28+B+%3D+%5Cfrac%7B1%7D%7B2%7Dx+%7C+E%3Da+%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p( A=x | E=a ) = p( B = &#92;frac{1}{2}x | E=a )' title='p( A=x | E=a ) = p( B = &#92;frac{1}{2}x | E=a )' class='latex' />. By similar reasoning to (i), this is also equal to <img src='http://s0.wp.com/latex.php?latex=p%28B+%3D+%5Cfrac%7B1%7D%7B2%7Dx%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(B = &#92;frac{1}{2}x)' title='p(B = &#92;frac{1}{2}x)' class='latex' />, and by exchange of A and B this is equal to <img src='http://s0.wp.com/latex.php?latex=p%28A+%3D+%5Cfrac%7B1%7D%7B2%7Dx%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(A = &#92;frac{1}{2}x)' title='p(A = &#92;frac{1}{2}x)' class='latex' />.</p>
<p>So we have that this particular notion of &#8220;literally not knowing anything&#8221; implies that the marginal prior for <em>A</em> has the property that</p>
<p><img src='http://s0.wp.com/latex.php?latex=p%28+A%3Dx+%29+%3D+p%28+A%3D%5Cfrac%7B1%7D%7B2%7Dx+%29%2C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p( A=x ) = p( A=&#92;frac{1}{2}x ),' title='p( A=x ) = p( A=&#92;frac{1}{2}x ),' class='latex' /></p>
<p>for every <em>x</em>. You can construct various fancy priors that have this property, such as <img src='http://s0.wp.com/latex.php?latex=p%28A%3Dx%29+%5Cpropto+1%2B%5Csin%282%5Cpi%5Clog_2%28x%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='p(A=x) &#92;propto 1+&#92;sin(2&#92;pi&#92;log_2(x))' title='p(A=x) &#92;propto 1+&#92;sin(2&#92;pi&#92;log_2(x))' class='latex' />, but the one that looks most like an ignorance prior is the uniform prior. Uniform priors for unbounded quantities are a bit odd and have a few formal subtleties, but you can deal with them. They&#8217;re improper priors, meaning that you can&#8217;t normalise them, but essentially, this ignorance prior assigns the same infinitesimal probability density to every value of <em>x</em>.</p>
<p>Using such a uniform prior for an amount of money is a bit weird. It means, for example, that the probability that the envelope contains £10 is the same as the probability that it contains £<img src='http://s0.wp.com/latex.php?latex=10%5E%7B10%5E%7B100%7D%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='10^{10^{100}}' title='10^{10^{100}}' class='latex' />, which is the same as the probability that it contains £<img src='http://s0.wp.com/latex.php?latex=10%5E%7B-1000%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='10^{-1000}' title='10^{-1000}' class='latex' />. But more than that, it means that the expected amount of money in the envelope is infinite. (This is also true of the &#8220;fancy&#8221; priors mentioned above.)</p>
<p>So now, finally, we can fully explain the paradox. If you really really &#8220;didn&#8217;t know anything&#8221; about the amount of money in the envelopes, you might be justified in assigning a uniform marginal prior to the value of each envelope&#8217;s contents. Then when you select envelope A, you can ask yourself &#8220;what is the expected amount of money in this envelope?&#8221; The answer is infinity. Should you switch? Well, the expected amount of money in envelope B is infinity, which is equal to infinity*5/4. So there&#8217;s no paradox. It doesn&#8217;t matter if you switch or not, because you&#8217;ll expect to become infinitely rich in any case.</p>
<p>But of course, if this was a real situation then the uniform prior would be a rather silly one. The contents of my envelopes cannot be less than 1p, and they can&#8217;t be more than the total amount of money in existence, which I guess is in the trillions of pounds. (Of course, you&#8217;re free to pick a smaller upper bound if you want.) Any prior you can come up with that fulfils these constraints will have finite expectations, and won&#8217;t allow you to conclude step 2 from step 1 in the argument above.</p>
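<p>To make this concrete, here is a small sketch using exact fractions. The prior is purely hypothetical: the Prize is taken to be equally likely to be £1, £2, £4 or £8, and the name <code>analyse</code> is invented for the example. It computes the probability that the other envelope holds double, given what is in yours, along with the overall expected gain from always switching.</p>

```python
from collections import defaultdict
from fractions import Fraction


def analyse(prizes):
    """Posterior p(B=2x | A=x) and expected gain from switching.

    Assumes a uniform prior over the given integer Prize values
    (a hypothetical choice, made only for illustration).
    """
    weight = Fraction(1, 2 * len(prizes))
    p_a = defaultdict(Fraction)        # p(A=x)
    p_a_prize = defaultdict(Fraction)  # p(A=x and A is the Prize), so B=2x
    for v in prizes:
        p_a[v] += weight               # A is the Prize, B holds 2v
        p_a_prize[v] += weight
        p_a[2 * v] += weight           # A is the Bonus, B holds v
    posterior = {x: p_a_prize[x] / p_a[x] for x in sorted(p_a)}
    gain = sum(
        p_a[x] * (posterior[x] * 2 * x + (1 - posterior[x]) * Fraction(x, 2) - x)
        for x in posterior
    )
    return posterior, gain


posterior, gain = analyse([1, 2, 4, 8])
for x, p in posterior.items():
    print(x, p)
print("expected gain from switching:", gain)
```

<p>Under this prior the conditional probability is 1/2 only for the middle values. It is 1 if you are holding £1 and 0 if you are holding £16, and those end cases are exactly what bring the overall expected gain from switching back to zero.</p>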
<p>So to conclude, the two envelopes paradox is not a paradox at all, but just an intuitively reasonable argument that has a hard-to-spot error. (Confusing a conditional distribution with an unconditional one.) If you work through the problem in a proper Bayesian fashion, you realise that you can&#8217;t avoid considering your prior knowledge of the envelopes&#8217; contents. As long as you choose a sensible prior, the problem evaporates.</p>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/jellymatter.wordpress.com/3448/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/jellymatter.wordpress.com/3448/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=jellymatter.com&#038;blog=19982511&#038;post=3448&#038;subd=jellymatter&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://jellymatter.com/2012/09/25/paradoxes-of-probability-theory-the-two-envelopes/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/3d3d01f62eb81589b829f28c8c9f6cd0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Nathaniel</media:title>
		</media:content>
	</item>
		<item>
		<title>More wrong interpretations of P values &#8211; &#8220;repeated sampling&#8221;</title>
		<link>http://jellymatter.com/2012/09/24/more-wrong-interpretations-of-p-values-repeated-sampling/</link>
		<comments>http://jellymatter.com/2012/09/24/more-wrong-interpretations-of-p-values-repeated-sampling/#comments</comments>
		<pubDate>Mon, 24 Sep 2012 12:39:55 +0000</pubDate>
		<dc:creator>James Thorniley</dc:creator>
				<category><![CDATA[Actual Science]]></category>
		<category><![CDATA[Fisher]]></category>
		<category><![CDATA[null hypothesis testing]]></category>
		<category><![CDATA[P values]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://jellymatter.com/?p=3365</guid>
		<description><![CDATA[A while ago I wrote a little rant on the (mis)interpretation of P-values. I&#8217;d like to return to this subject having investigated a little more. First, this post, I&#8217;m going to point to an interesting little subtlety pointed out by Fisher that I hadn&#8217;t thought about before, in the second post, I will argue why [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=jellymatter.com&#038;blog=19982511&#038;post=3365&#038;subd=jellymatter&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>A while ago I wrote a little <a href="http://jellymatter.com/2011/12/05/when-should-you-use-a-null-hypothesis-test-probably-never/">rant</a> on the (mis)interpretation of P-values. I&#8217;d like to return to this subject having investigated a little more. First, in this post, I&#8217;m going to highlight an interesting little subtlety pointed out by <a href="http://www.jstor.org/stable/2983785">Fisher</a> that I hadn&#8217;t thought about before; in the second post, I will argue why P-values aren&#8217;t as bad as they are sometimes made out to be.</p>
<p>So, last time, I stressed the point that you can&#8217;t interpret a P-value as a probability or frequency of anything, unless you say &#8220;given that the null hypothesis is true&#8221;. Most misinterpretations, e.g. &#8220;the probability that you would accept the null hypothesis if you tried the experiment again&#8221;, make this error. But there is one common interpretation that is less obviously false: &#8220;A P-value is the probability that the data would deviate as or more strongly from the null hypothesis in another experiment, than they did in the current experiment, given that the null hypothesis is true&#8221;. This is something that you might think is a more careful statement, but the problem is that in fact when we calculate P values we take into account aspects of the data not necessarily related to how strongly they deviate from the prediction of the null hypothesis. This could be misleading, so we&#8217;ll build it up more precisely in this post.</p>
<p><span id="more-3365"></span></p>
<p>To begin with, let&#8217;s make up some symbolic language. &#8220;The data&#8221; are taken to be a random variable <img src='http://s0.wp.com/latex.php?latex=E&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='E' title='E' class='latex' />, and the particular instance of the data observed in the current experiment <img src='http://s0.wp.com/latex.php?latex=e&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e' title='e' class='latex' />. This evidence itself is likely to consist of a number of observations, e.g. a collection of real values <img src='http://s0.wp.com/latex.php?latex=x_1%2C+x_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x_1, x_2' title='x_1, x_2' class='latex' /> etc. The null hypothesis <img src='http://s0.wp.com/latex.php?latex=H_0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H_0' title='H_0' class='latex' /> will specify a value for some parameter of interest, call it <img src='http://s0.wp.com/latex.php?latex=%5Cbeta&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;beta' title='&#92;beta' class='latex' />. For example:</p>
<p><img src='http://s0.wp.com/latex.php?latex=H_0+%5Cimplies+%5Cbeta+%3D+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H_0 &#92;implies &#92;beta = 0' title='H_0 &#92;implies &#92;beta = 0' class='latex' /></p>
<p>Generally, in order to investigate this parameter, we find a &#8220;sufficient statistic&#8221;, that is, a function(al) that operates on the observed data <img src='http://s0.wp.com/latex.php?latex=e&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e' title='e' class='latex' /> to produce (usually) a single real number that contains all the relevant information that allows the data to discriminate between different possible values of the parameter of interest. We&#8217;ll call this <img src='http://s0.wp.com/latex.php?latex=R&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R' title='R' class='latex' />, so for example we might use <img src='http://s0.wp.com/latex.php?latex=R%28e%29+%3D+%5Cbar%7Bx%7D&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R(e) = &#92;bar{x}' title='R(e) = &#92;bar{x}' class='latex' /> &#8211; the arithmetic mean of the data values produced by an experiment (if the parameter of interest is the expectation value of the process that is being investigated). This statistic has a &#8220;sampling distribution&#8221;, i.e. let <img src='http://s0.wp.com/latex.php?latex=R%28E%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R(E)' title='R(E)' class='latex' /> be a random variable representing the randomness of the <em>statistic</em> rather than the data itself.</p>
<p>Now we need to say what &#8220;deviate as or more strongly&#8221; means. For this, assume there is a distance function <img src='http://s0.wp.com/latex.php?latex=d&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d' title='d' class='latex' /> which describes how far the statistic is from the expectation under the null hypothesis. So if the null hypothesis predicts a zero population mean or expectation and <img src='http://s0.wp.com/latex.php?latex=R%28e%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R(e)' title='R(e)' class='latex' /> is the sample mean from the experiment, we could have:</p>
<p><img src='http://s0.wp.com/latex.php?latex=d%28R%28e%29%29+%3D+%7CR%28e%29%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='d(R(e)) = |R(e)|' title='d(R(e)) = |R(e)|' class='latex' /></p>
<p>(i.e. the absolute value of the statistic). The null hypothesis says that the parameter is zero, so it implies that the statistic (in this case an estimator of the parameter) is also (expected to be) zero &#8211; we just take how far the statistic is from zero as a measure of deviation. Then, we can encode the above interpretation of the P-value symbolically:</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5CPr%28+d%28R%28E%29%29+%5Cge+d%28R%28e%29%29+%7C+H_0+%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;Pr( d(R(E)) &#92;ge d(R(e)) | H_0 )' title='&#92;Pr( d(R(E)) &#92;ge d(R(e)) | H_0 )' class='latex' /></p>
<p>&#8220;The probability that the distance of the statistic from the prediction is greater than or equal to the distance observed, given that the null hypothesis is true&#8221;.</p>
<p>By the law of large numbers, this could be read as the relative frequency with which the statistic would take a more extreme value than the one observed if we repeat the experiment many times on a process where <img src='http://s0.wp.com/latex.php?latex=H_0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H_0' title='H_0' class='latex' /> applies. This might look fairly reasonable, so what is wrong? In fact, for many tests, it might be right, but I&#8217;ll borrow an example from <a href="http://www.jstor.org/stable/2983785">Fisher</a> of a case where it&#8217;s not.</p>
<p>Suppose we are doing a linear regression on two Gaussian random variables. We take a model with <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=X&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='X' title='X' class='latex' /> as two random variables:</p>
<p><img src='http://s0.wp.com/latex.php?latex=y+%3D+%5Calpha+%2B+%5Cbeta+x+%2B+%5Cvarepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='y = &#92;alpha + &#92;beta x + &#92;varepsilon' title='y = &#92;alpha + &#92;beta x + &#92;varepsilon' class='latex' /></p>
<p>The parameter <img src='http://s0.wp.com/latex.php?latex=%5Calpha&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;alpha' title='&#92;alpha' class='latex' /> represents an offset which is more or less irrelevant for our purposes. The &#8220;slope&#8221; <img src='http://s0.wp.com/latex.php?latex=%5Cbeta&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;beta' title='&#92;beta' class='latex' /> is the strength of the relation, and will be the subject of our hypothesis test. There is also some random deviation <img src='http://s0.wp.com/latex.php?latex=%5Cvarepsilon&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;varepsilon' title='&#92;varepsilon' class='latex' /> such that <img src='http://s0.wp.com/latex.php?latex=Y&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='Y' title='Y' class='latex' /> is distributed conditionally (for any given value of <img src='http://s0.wp.com/latex.php?latex=x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='x' title='x' class='latex' />) as a Gaussian with standard deviation <img src='http://s0.wp.com/latex.php?latex=%5Csigma&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma' title='&#92;sigma' class='latex' />.</p>
<p>We do an experiment and obtain <img src='http://s0.wp.com/latex.php?latex=N&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='N' title='N' class='latex' /> pairs of values, <img src='http://s0.wp.com/latex.php?latex=e+%3D+%28+%28x_1%2Cy_1%29%2C%28x_2%2Cy_2%29%2C...%28x_N%2Cy_N%29+%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='e = ( (x_1,y_1),(x_2,y_2),...(x_N,y_N) )' title='e = ( (x_1,y_1),(x_2,y_2),...(x_N,y_N) )' class='latex' />. Presenting the regression in the same way as Fisher, let&#8217;s define three intermediate statistics:</p>
<p><img src='http://s0.wp.com/latex.php?latex=C_x+%3D+%5Csum+%28x_i-%5Cbar%7Bx%7D%29%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_x = &#92;sum (x_i-&#92;bar{x})^2' title='C_x = &#92;sum (x_i-&#92;bar{x})^2' class='latex' /></p>
<p><img src='http://s0.wp.com/latex.php?latex=C_%7Bxy%7D+%3D+%5Csum+%28x_i-%5Cbar%7Bx%7D%29%28y_i-%5Cbar%7By%7D%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_{xy} = &#92;sum (x_i-&#92;bar{x})(y_i-&#92;bar{y})' title='C_{xy} = &#92;sum (x_i-&#92;bar{x})(y_i-&#92;bar{y})' class='latex' /></p>
<p><img src='http://s0.wp.com/latex.php?latex=C_y+%3D+%5Csum+%28y_i-%5Cbar%7By%7D%29%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_y = &#92;sum (y_i-&#92;bar{y})^2' title='C_y = &#92;sum (y_i-&#92;bar{y})^2' class='latex' /></p>
<p>Say that the null hypothesis says <img src='http://s0.wp.com/latex.php?latex=%5Cbeta+%3D+0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;beta = 0' title='&#92;beta = 0' class='latex' /> again &#8211; the data are uncorrelated. A suitable estimator for <img src='http://s0.wp.com/latex.php?latex=%5Cbeta&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;beta' title='&#92;beta' class='latex' /> is:</p>
<p><img src='http://s0.wp.com/latex.php?latex=R%28e%29+%3D+b+%3D+C_%7Bxy%7D%2FC_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='R(e) = b = C_{xy}/C_x' title='R(e) = b = C_{xy}/C_x' class='latex' /></p>
<p>Of course the null hypothesis predicts <img src='http://s0.wp.com/latex.php?latex=b%3D0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b=0' title='b=0' class='latex' />, so a reasonable distance value is <img src='http://s0.wp.com/latex.php?latex=%7Cb%7C&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='|b|' title='|b|' class='latex' />. According to the &#8220;misinterpretation&#8221;, the P value is</p>
<p><img src='http://s0.wp.com/latex.php?latex=%5CPr%28+%7CB%7C+%5Cge+%7Cb%7C+%7C+H_0%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;Pr( |B| &#92;ge |b| | H_0)' title='&#92;Pr( |B| &#92;ge |b| | H_0)' class='latex' /></p>
<p>Where <img src='http://s0.wp.com/latex.php?latex=B&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='B' title='B' class='latex' /> is the estimator statistic as a random variable (i.e. representing the values the estimator would take if you sampled over and over from a system where the two variables were uncorrelated).</p>
<p>But in fact, the usual way to test this null hypothesis is to do a t-test. This involves first getting the estimator for <img src='http://s0.wp.com/latex.php?latex=%5Csigma%5E2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;sigma^2' title='&#92;sigma^2' class='latex' />:</p>
<p><img src='http://s0.wp.com/latex.php?latex=s%5E2+%3D+%28C_y-C_%7Bxy%7D%5E2%2FC_x%29+%2F+%28N-2%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='s^2 = (C_y-C_{xy}^2/C_x) / (N-2)' title='s^2 = (C_y-C_{xy}^2/C_x) / (N-2)' class='latex' /></p>
<p>Then calculating the t-value:</p>
<p><img src='http://s0.wp.com/latex.php?latex=t+%3D+b+%5Csqrt%7BC_x%7D%2Fs&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t = b &#92;sqrt{C_x}/s' title='t = b &#92;sqrt{C_x}/s' class='latex' /></p>
<p>And the P-value is the probability of the <img src='http://s0.wp.com/latex.php?latex=t&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='t' title='t' class='latex' /> observed or one more extreme given that the null hypothesis is correct:</p>
<p><img src='http://s0.wp.com/latex.php?latex=P+%3D+2+%281+-+F_t%28%7Ct%7C%3BN-2%29%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P = 2 (1 - F_t(|t|;N-2))' title='P = 2 (1 - F_t(|t|;N-2))' class='latex' /></p>
<p>Where <img src='http://s0.wp.com/latex.php?latex=F_t%28x%3B%5Clambda%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='F_t(x;&#92;lambda)' title='F_t(x;&#92;lambda)' class='latex' /> is the cumulative distribution function of the t-distribution with <img src='http://s0.wp.com/latex.php?latex=%5Clambda&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;lambda' title='&#92;lambda' class='latex' /> degrees of freedom. We multiply by 2 because we want a &#8220;two-tailed&#8221; test. But because t is calculated using the actual value of <img src='http://s0.wp.com/latex.php?latex=C_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_x' title='C_x' class='latex' /> (observed in the experiment), it is dependent on this actual value, even though that value would not be constant across all experiments. That is, the <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> value represents the probability that the test statistic would take a more extreme value under the null hypothesis:</p>
<p><img src='http://s0.wp.com/latex.php?latex=P+%3D+%5CPr%28+%7CT%7C+%5Cge+%7Ct%7C+%7C+H_0+%29&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P = &#92;Pr( |T| &#92;ge |t| | H_0 )' title='P = &#92;Pr( |T| &#92;ge |t| | H_0 )' class='latex' /></p>
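<p>For concreteness, the t-test calculation above can be sketched in Python. This is only a minimal sketch under stated assumptions, not the code used for this post: it assumes that C_x, C_y and C_xy are the centred sums of squares and cross products and that b = C_xy / C_x, the function names <code>t_cdf</code> and <code>slope_t_test</code> are made up for illustration, and the t-distribution CDF is obtained by a simple numerical integration rather than from a statistics library.</p>

```python
import math


def t_cdf(x, dof, steps=2000):
    """CDF of Student's t-distribution, by Simpson's rule on the density."""
    log_norm = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
                - 0.5 * math.log(dof * math.pi))

    def pdf(u):
        return math.exp(log_norm - (dof + 1) / 2 * math.log1p(u * u / dof))

    upper = abs(x)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # integral of the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area


def slope_t_test(xs, ys):
    """Return (b, t, P) for the null hypothesis that the slope is zero."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Assumed definitions: centred sums of squares and cross products.
    c_x = sum((x - mean_x) ** 2 for x in xs)
    c_y = sum((y - mean_y) ** 2 for y in ys)
    c_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = c_xy / c_x                              # slope estimate
    s2 = (c_y - c_xy ** 2 / c_x) / (n - 2)      # estimator for sigma^2
    t = b * math.sqrt(c_x) / math.sqrt(s2)      # t-value
    p = 2 * (1 - t_cdf(abs(t), n - 2))          # two-tailed P value
    return b, t, p


# Toy data, invented purely to exercise the functions.
b, t, p = slope_t_test([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(f"b = {b:.3f}, t = {t:.3f}, P = {p:.3f}")
```

<p>Note that <code>c_x</code> enters the t-value directly, which is the dependence on the observed data discussed above.</p>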
<p>We can see how this is different from the probability that the data deviate more strongly from the null with a little computational experiment (<a href="http://pastebin.com/nR15Qs45">Python code</a>). First, generate N data points using a Gaussian random number generator (so X and Y are genuinely two independent random variables). Then calculate the <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> and <img src='http://s0.wp.com/latex.php?latex=P&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='P' title='P' class='latex' /> values as above. Then, if we re-run the process (generate many, say 500, sets of N uncorrelated data points), the original (mis)interpretation of the P value says that P should be the proportion of the time that <img src='http://s0.wp.com/latex.php?latex=b&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='b' title='b' class='latex' /> takes a larger absolute value in one of the re-runs than it did in the original experiment.</p>
<p>If we repeat this process a few times, we can plot the P values obtained against the proportion of the time that b took a greater value in one of the re-runs than it did in the original experiment:</p>
<p><a href="http://jellymatter.files.wordpress.com/2012/09/repeats1.png"><img class="alignnone size-medium wp-image-3428" title="P value versus more extreme data" src="http://jellymatter.files.wordpress.com/2012/09/repeats1.png?w=300&#038;h=225" alt="P value versus more extreme data" width="300" height="225" /></a></p>
<p>If it was the case that the P value was the proportion of the time that the data would be observed to deviate more strongly from the prediction under the null hypothesis, the points should lie very close to the diagonal &#8211; but they don&#8217;t. I&#8217;ve also coloured the points blue if in the original experiment, <img src='http://s0.wp.com/latex.php?latex=C_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_x' title='C_x' class='latex' /> was very low (less than 14) and red if it was very high (more than 27 &#8211; the values were chosen empirically). Notice that the blue values tend to lie below the diagonal, and the red values above it &#8211; so you can see that the probability of observing more extreme data is dependent not only on the P value, but on the value of <img src='http://s0.wp.com/latex.php?latex=C_x&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='C_x' title='C_x' class='latex' /> in the original experiment.</p>
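<p>The re-run side of this experiment can be sketched roughly as follows. This is not the linked code, just a minimal illustration: the sample size of 20, the seed, and the function names are all assumptions, and b is again assumed to be C_xy / C_x with centred sums.</p>

```python
import random


def slope_and_cx(xs, ys):
    """Return the slope estimate b and the centred sum of squares C_x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    c_x = sum((x - mean_x) ** 2 for x in xs)
    c_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return c_xy / c_x, c_x


def sample_experiment(n, rng):
    """Draw n independent Gaussian (x, y) pairs, so the null is true."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return xs, ys


def proportion_more_extreme(b_observed, rerun_slopes):
    """Fraction of re-runs whose |b| is at least the observed |b|."""
    hits = sum(1 for b in rerun_slopes if abs(b) >= abs(b_observed))
    return hits / len(rerun_slopes)


rng = random.Random(0)   # seed chosen arbitrarily
n, reruns = 20, 500      # n = 20 is an assumption; 500 re-runs as in the text

# The "original" experiment.
b_obs, c_x_obs = slope_and_cx(*sample_experiment(n, rng))

# Re-run the whole experiment many times under the null.
rerun_slopes = [slope_and_cx(*sample_experiment(n, rng))[0]
                for _ in range(reruns)]

print("observed b:", b_obs, "observed C_x:", c_x_obs)
print("proportion of re-runs with larger |b|:",
      proportion_more_extreme(b_obs, rerun_slopes))
```

<p>Repeating the last few lines for several &#8220;original&#8221; experiments, and comparing each proportion with the corresponding t-test P value, gives the kind of scatter shown in the plot.</p>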
<p>So to sum up, we have a few different kinds of statistics:</p>
<ul>
<li>The estimator for the relevant parameter &#8211; i.e. a sufficient statistic for the parameter of interest in the hypothesis</li>
<li>Ancillary statistics &#8211; give information about parameters not relevant to the null hypothesis (e.g. the variance in the linear regression model)</li>
<li>Test statistics &#8211; make use of the relevant statistic and potentially ancillary statistics.</li>
</ul>
<p>So the P value is in fact conditional not only on the null hypothesis, but on the ancillary statistics:</p>
<p>P = Pr( data more extreme than observed | <img src='http://s0.wp.com/latex.php?latex=H_0&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='H_0' title='H_0' class='latex' />, ancillary statistics as observed )</p>
<p>Fisher points out that of course this can&#8217;t reasonably be interpreted as any mechanical procedure of repeated sampling, since it would imply that you would have to select for those samples where the ancillary statistics were exactly the same as the one you observed in the original test (which is never going to happen).</p>
<p>I&#8217;m not going to draw any grand conclusions from this. I&#8217;m not even quite sure what to make of it at the moment. However, it does seem to me like quite an important point if you want to understand what a P value is.</p>
]]></content:encoded>
			<wfw:commentRss>http://jellymatter.com/2012/09/24/more-wrong-interpretations-of-p-values-repeated-sampling/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/8c84fe8613248465600ad95e94393d40?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jamesthorniley</media:title>
		</media:content>

		<media:content url="http://jellymatter.files.wordpress.com/2012/09/repeats1.png?w=300" medium="image">
			<media:title type="html">P value versus more extreme data</media:title>
		</media:content>
	</item>
		<item>
		<title>Randomised controlled trials &#8211; the &#8220;gold standard&#8221;?</title>
		<link>http://jellymatter.com/2012/08/29/randomised-controlled-trials-the-gold-standard/</link>
		<comments>http://jellymatter.com/2012/08/29/randomised-controlled-trials-the-gold-standard/#comments</comments>
		<pubDate>Wed, 29 Aug 2012 10:24:07 +0000</pubDate>
		<dc:creator>James Thorniley</dc:creator>
				<category><![CDATA[Opinion]]></category>
		<category><![CDATA[adaptation]]></category>
		<category><![CDATA[Cartwright]]></category>
		<category><![CDATA[causality]]></category>
		<category><![CDATA[counterfactuals]]></category>
		<category><![CDATA[Goldacre]]></category>
		<category><![CDATA[philosophy of science]]></category>
		<category><![CDATA[RCT]]></category>
		<category><![CDATA[Taleb]]></category>

		<guid isPermaLink="false">http://jellymatter.com/?p=3077</guid>
		<description><![CDATA[The UK government&#8217;s (ever so slightly creepily named) &#8220;Behavioural Insights Team&#8221; released a report [PDF] (relatively) recently called &#8220;Test, Learn, Adapt&#8221; (the authors include Ben Goldacre, well known for the book &#8220;Bad Science&#8221;, and the director of the York Trials Unit, David Torgerson) arguing that more policy decisions should be made on the basis of [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=jellymatter.com&#038;blog=19982511&#038;post=3077&#038;subd=jellymatter&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>The UK government&#8217;s (ever so slightly creepily named) &#8220;Behavioural Insights Team&#8221; released a report <a href="http://www.cabinetoffice.gov.uk/sites/default/files/resources/TLA-1906126.pdf">[PDF]</a> (relatively) recently called &#8220;Test, Learn, Adapt&#8221; (the authors include Ben Goldacre, well known for the book &#8220;Bad Science&#8221;, and the director of the York Trials Unit, David Torgerson) arguing that more policy decisions should be made on the basis of evidence from randomised controlled trials (RCTs). The report is a really good plain-English explanation of what RCTs are and how they work. It also gives examples of how RCTs can perhaps help to inform policies, by testing whether interventions such as back-to-work schemes or educational programs, um, &#8220;work&#8221;. According to the report&#8217;s <a href="http://www.cabinetoffice.gov.uk/resource-library/test-learn-adapt-developing-public-policy-randomised-controlled-trials">blurb</a>:</p>
<blockquote><p>RCTs are the best way of determining whether a policy or intervention is working.</p></blockquote>
<p>It&#8217;s not hard to find opinion pieces backing up the report&#8217;s central idea, and the thesis that RCTs are the best way to &#8220;find things out&#8221;. Here&#8217;s <a href="http://timharford.com/2012/06/why-real-life-needs-real-trials/"> one by Tim Harford</a>, a writer who covers economics; <a href="http://www.guardian.co.uk/society/joepublic/2011/sep/27/new-public-policies-test">a similar argument</a> made by Paul Johnson who is the director of an economics research group, the Institute for Fiscal Studies; and <a href="http://blogs.nature.com/soapboxscience/2012/08/22/test-learn-adapt-a-scientific-approach-to-public-policy?WT.mc_id=TWT_NatureBlogs&amp;utm_source=buffer&amp;buffer_share=d6e83">Prateek Buch</a>, who is a research scientist. A phrase that keeps popping up is &#8220;gold standard&#8221;. RCTs are &#8220;the gold standard in evidence&#8221;, says Johnson, or the &#8220;gold-standard for showing that medical interventions are effective&#8221; according to Buch. Mark Henderson&#8217;s book, &#8220;The Geek Manifesto&#8221; <a href="http://books.google.co.uk/books?id=lIbbh0DcmfUC&amp;lpg=PT73&amp;vq=gold%20standard&amp;pg=PT73#v=onepage&amp;q&amp;f=false">says</a> that the RCT is &#8220;commonly considered the &#8216;gold standard&#8217; for medical research because it seeks systematically to minimise potential bias through a series of simple safeguards&#8221;. What exactly does all this mean? I think it&#8217;s a question worth asking, since not all science involves RCTs. The Higgs boson for example, was recently &#8220;discovered&#8221; (if that&#8217;s the word) without (as far as I can tell) the need to randomise test subjects. So are RCTs in fact the &#8220;gold standard&#8221;?</p>
<p><span id="more-3077"></span></p>
<p>A little searching reveals I&#8217;m far from the first person to ask this question. An article by Nancy Cartwright: <a href="http://personal.lse.ac.uk/cartwrig/PapersOnEvidence/Are%20RCTs%20the%20gold%20standard.pdf">&#8220;Are RCTs the Gold Standard?&#8221;</a>, addresses exactly this, as does to an extent her book <a href="http://books.google.co.uk/books/about/Hunting_Causes_and_Using_Them.html?id=HRgVd6QCEikC&amp;redir_esc=y">&#8220;Hunting Causes and Using Them&#8221;</a>. I think Cartwright makes some good points, so I&#8217;ll use these as a jumping off point. First, though, there&#8217;s one obvious respect in which RCTs can be considered the &#8220;gold standard&#8221; which Cartwright doesn&#8217;t directly address (perhaps because it&#8217;s too obvious), which is simply that RCTs are the gold standard because everyone already agrees they are. In medical research (as Buch says), RCTs are used all the time. But everybody agreeing isn&#8217;t really the kind of reason for using RCTs that I&#8217;m looking for.</p>
<p>For Cartwright, the reason RCTs are interesting is that they tell us about <em>causes</em>. One way of putting this would be that an RCT helps to test the claim (if we&#8217;re thinking about testing a pill) that &#8220;the patient got better after taking the pill, but would not have got better had they not taken the pill&#8221;. That is, rather than simply talking about correlations (the patients that took the pill got better, the ones that didn&#8217;t, didn&#8217;t), RCTs can potentially answer &#8220;counterfactual&#8221; questions about what would have happened if patients who (in reality) took the pill had not done so (hypothetically, in imaginationland). This can sound complicated, but I think it looks like a reasonable definition of &#8220;causation&#8221; to most people (Cartwright doesn&#8217;t use this definition, but I think it&#8217;s intuitive and it will do for our purposes).</p>
<p>Now, RCTs (according to Cartwright) guarantee that certain causal conclusions follow deductively given the assumptions of the RCT design and a positive result in the trial &#8211; the conclusion (i.e. that, say, a pill &#8220;works&#8221; in the counterfactual sense discussed above) follows logically, and you can probably intuit roughly what the logic is if you read &#8220;Test, Learn, Adapt&#8221;. This makes RCTs, as a method of testing causal claims, &#8220;clinchers&#8221; rather than &#8220;vouchers&#8221;; the latter are bits of evidence, often considered &#8220;softer&#8221;, that might be consistent with a particular causal claim but don&#8217;t necessarily disprove any others.</p>
<p>There are two main issues that Cartwright points out with saying that RCTs are the gold standard. One is that, generally, there are a lot of possible &#8220;clinchers&#8221; that aren&#8217;t RCTs: anything where the causal conclusion follows deductively from the assumptions and the test result qualifies. This applies not only to the &#8220;ideal RCT&#8221; but also to many other methods (Cartwright gives a list; I might go into it in another post). Of course to make the causal conclusion we have to be sure that the assumptions are met, and one could argue that RCTs offer a way of ensuring this. But they certainly don&#8217;t guarantee it.</p>
<p>Secondly, &#8220;vouchers&#8221; have a certain advantage over &#8220;clinchers&#8221;, namely that of external validity &#8211; it is hard (even with an RCT) to be sure that a conclusion obtained on a test population is correct for another target population. If you test a pill on some people who are alive today, how do you know that it will still work on people who are alive tomorrow? Or in a year, a decade or a century? Our understanding of human physiology suggests we aren&#8217;t going to be tricked by a pill that only works on Tuesdays, but what about testing government policies (such as back-to-work schemes which are discussed in the report I originally cited)? How do we know people will react to being put on a given scheme the same way this year as next? The extrapolation seems much more shaky &#8211; you can&#8217;t just assume that even as economic and social factors change, everyone&#8217;s behaviour stays the same, but equally you can&#8217;t do a new RCT on every policy every year, since that would be, well, silly. RCTs have no (intrinsic) way of dealing with this problem (which is not to say that it cannot be analysed quantitatively, it&#8217;s just that the RCT methodology doesn&#8217;t fix it for you).</p>
<p>RCTs are nice, they make a lot of sense, and they are useful in many settings. But to say they are the &#8220;gold standard&#8221; smacks of received wisdom. What this report and the many comments on it ignore (or gloss over) is that RCTs rely on satisfying assumptions (internal validity) and extrapolating from limited evidence (external validity) just as much as anything else in science. That doesn&#8217;t make RCTs <em>bad</em> (far from it), it&#8217;s just that the &#8220;gold standard&#8221; claim seems to be based on the assumption that RCTs automatically produce the &#8220;right&#8221; result, when that only happens if you are careful about how you carry out the experiment, and the conclusions you draw from it. Cartwright again:</p>
<blockquote><p>But this introduces <em>expert judgment</em> into the assessment of internal validity, which RCT advocates tend to despise. Without expert judgment, however, the claims that the requisite assumptions for the RCT to be internally valid are met depend on fallible mechanical procedures. Expert judgments are naturally fallible too, but to rely on mechanics without experts to watch for where failures occur makes the entire proceeding unnecessarily dicey.</p></blockquote>
<p>The italics are in the original, and Cartwright has a point about expert judgment, and RCT advocates&#8217; views towards it. But I don&#8217;t think anyone is suggesting you can run RCTs without expert oversight, as &#8220;Test, Learn, Adapt&#8221; says:</p>
<blockquote><p>RCTs in their simplest form are very straightforward to run. However there are some hidden pitfalls which mean that some <em>expert support</em> is advisable at the outset. [Emphasis added]</p></blockquote>
<p>Personally, I think part of the reason you can&#8217;t really avoid some kind of expert judgment is that causal questions tend to involve thinking about what <em>would</em> have happened &#8220;in imaginationland&#8221;, i.e. the counterfactual question I used to define cause above. It&#8217;s hard to know what imaginationland looks like in any given context, so we inevitably end up just asking an &#8220;expert&#8221;. But RCT advocates don&#8217;t want us to talk to experts (apart from RCT experts). &#8220;Test, Learn, Adapt&#8221; encourages us to read &#8220;<a href="http://books.google.co.uk/books/about/The_Black_Swan.html?id=7wMuF4A4XF8C&amp;redir_esc=y">The Black Swan</a>&#8221;[1] by Nassim Nicholas Taleb, and gives this interpretation:</p>
<blockquote><p>Similarly, such thinkers tend to be sceptical about the ability of even the wisest experts and leaders to offer a comprehensive strategy or masterplan detailing ʻtheʼ best practice or answer on the ground (certainly on a universal basis). Instead they urge the deliberate nurturing of variation coupled with systems, or dynamics, that squeeze out less effective variations and reward and expand those variations that seem to work better.</p></blockquote>
<p>Ok, when I said it was &#8220;plain-English&#8221;, I might have been being generous in places.</p>
<p>[1] Having read it I can safely say that Taleb <em>really</em> doesn&#8217;t like experts. There&#8217;s the odd good idea in this book (though they are mostly borrowed from Benoit Mandelbrot among others), but it&#8217;s hard work getting past the author&#8217;s ego, baseless attacks on anyone who ever does any maths or who wears clothes Taleb doesn&#8217;t approve of, and constant reminders that the author is oh so clever. I don&#8217;t recommend it.</p>
]]></content:encoded>
			<wfw:commentRss>http://jellymatter.com/2012/08/29/randomised-controlled-trials-the-gold-standard/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/8c84fe8613248465600ad95e94393d40?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jamesthorniley</media:title>
		</media:content>
	</item>
		<item>
		<title>The Chemistry of Economics</title>
		<link>http://jellymatter.com/2012/08/01/the-chemistry-of-economics/</link>
		<comments>http://jellymatter.com/2012/08/01/the-chemistry-of-economics/#comments</comments>
		<pubDate>Wed, 01 Aug 2012 12:00:44 +0000</pubDate>
		<dc:creator>Nathaniel Virgo</dc:creator>
				<category><![CDATA[Opinion]]></category>
		<category><![CDATA[agriculture]]></category>
		<category><![CDATA[autocatalysis]]></category>
		<category><![CDATA[chemical reaction networks]]></category>
		<category><![CDATA[chemistry]]></category>
		<category><![CDATA[economics]]></category>
		<category><![CDATA[fossil fuels]]></category>
		<category><![CDATA[haber-bosch process]]></category>
		<category><![CDATA[nom nom nom]]></category>

		<guid isPermaLink="false">http://jellymatter.com/?p=2577</guid>
		<description><![CDATA[In order to understand economics, you must first understand chemistry.  That&#8217;s my story at least, and I&#8217;m sticking to it.  I&#8217;m neither an economist nor a chemist (not a real one anyway), but I&#8217;ve been thinking a lot about how to understand economics in chemical terms. In a previous post I discussed autocatalysis, the mechanism by [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=jellymatter.com&#038;blog=19982511&#038;post=2577&#038;subd=jellymatter&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>In order to understand economics, you must first understand chemistry.  That&#8217;s my story at least, and I&#8217;m sticking to it.  I&#8217;m neither an economist nor a chemist (not a real one anyway), but I&#8217;ve been thinking a lot about how to understand economics in chemical terms.</p>
<p>In <a href="http://jellymatter.com/2011/12/12/the-primordial-haze/">a previous post</a> I discussed <em>autocatalysis</em>, the mechanism by which a bunch of different molecules can react with each other in such a way that they end up producing more of themselves, at the cost of using something else up.  The ideas in that post don&#8217;t only apply to chemistry &#8211; you can use them to think about just about any kind of physical process.  In this post I&#8217;ll talk about how to think about the economy as a whole in autocatalytic terms. But let&#8217;s start with something on a smaller scale, the process of baking bread:</p>
<p><a href="http://jellymatter.files.wordpress.com/2011/12/screen-shot-2011-12-27-at-18-22-32.png"><img class="alignnone size-medium wp-image-2579" title="Screen shot 2011-12-27 at 18.22.32" src="http://jellymatter.files.wordpress.com/2011/12/screen-shot-2011-12-27-at-18-22-32.png?w=300&#038;h=291" alt="" width="300" height="291" /></a></p>
<p><span id="more-2577"></span></p>
<p>(You can click on the images for bigger versions.) This is a variation on the diagrams I used in my previous post, except that I&#8217;ve changed the conventions a bit: I&#8217;m now using boxes to represent &#8220;stuff&#8221;, because it&#8217;s more traditional and because it&#8217;s easier to fit text into them, and I&#8217;m using hexagons to represent processes, because hexagons are cool.</p>
<p>Arrows always either go from a box (stuff) to a hexagon (process), meaning the process uses up the stuff, or the other way around, meaning the process creates stuff. The above diagram means that the process of baking takes in water, fuel, flour and human labour to produce bread.  It also uses up oxygen and gives off carbon dioxide, mostly because baking involves burning the fuel to create heat. I haven&#8217;t drawn the heat on the diagram, but it could be drawn just like the other products. In chemistry terms, the heat given off per unit bread produced would be called the enthalpy of the baking reaction.</p>
<p>The baker&#8217;s shop itself acts as a catalyst &#8211; it doesn&#8217;t get created or used up by the baking process, but it&#8217;s necessary in order for the process to occur.  I&#8217;m assuming the baker grows her own yeast, so she always ends up with a bit more yeast than she started with.  This means that, according to the definition in my previous post, the yeast is autocatalytic. This shouldn&#8217;t be surprising, because making more of themselves is what organisms do.</p>
<p>If I was being rigorous I would put numbers on all of the arrows, indicating how much of each of these things gets used or created, per unit of bread. In chemistry these would be called the &#8220;stoichiometric coefficients&#8221; of the baking reaction, and the coefficient on the outgoing yeast arrow would be slightly higher than the one on the incoming arrow. Since I don&#8217;t know the numbers and don&#8217;t want to clutter the diagram, I&#8217;ve drawn the outgoing arrow as a double one as a kind of shorthand for this.</p>
<p>This picture is a huge simplification of course.  In reality baking consists of many processes, including making the dough, allowing it to rise, kneading it, the burning of the fuel (which might take place miles away in a power station), and the actual conversion of dough into bread in the oven.  But it&#8217;s OK to lump all this together into a single process as long as we remember that it&#8217;s really composed of lots of separate ones. (Chemists do this all the time.)</p>
<p>But instead of zooming in deeper into the details of the baking process, let&#8217;s zoom out a bit and look at what happens to the bread after it&#8217;s made:</p>
<p><a href="http://jellymatter.files.wordpress.com/2011/12/screen-shot-2011-12-27-at-18-43-03.png"><img class="alignnone size-medium wp-image-2599" title="Screen shot 2011-12-27 at 18.43.03" src="http://jellymatter.files.wordpress.com/2011/12/screen-shot-2011-12-27-at-18-43-03.png?w=300&#038;h=277" alt="" width="300" height="277" /></a></p>
<p>I&#8217;ve added a second process to the diagram, representing the bread being eaten by people.  This uses up oxygen and produces carbon dioxide, because breathing is an integral part of how we extract energy from food. It also uses up some other food groups (otherwise you&#8217;d get rickets) and produces sewage, which must be disposed of.</p>
<p>I&#8217;ve also drawn it as producing &#8220;human labour&#8221;, representing the fact that once people have been fed they are able to work. Housing acts as a catalyst because it&#8217;s very difficult for people to eat and work if they don&#8217;t have somewhere to live.</p>
<p>People also reproduce, of course, and we need more things than these basics in order to live and work.  As in the case of baking, a more detailed diagram could include these things, but this one gets the point across well enough for now.</p>
<p>The double arrow from eating to labour represents the fact that you get more labour out of the combined baking/eating process than you put in, otherwise everyone would spend all of their time making bread and no-one would be able to do any other jobs. (It would be possible to work this out from the stoichiometric coefficients of the two reactions if they were included in the diagram.)  This results in an autocatalytic cycle, highlighted below:</p>
<p><a href="http://jellymatter.files.wordpress.com/2011/12/screen-shot-2011-12-30-at-12-58-42.png"><img class="alignnone size-medium wp-image-2619" title="Screen shot 2011-12-30 at 12.58.42" src="http://jellymatter.files.wordpress.com/2011/12/screen-shot-2011-12-30-at-12-58-42.png?w=300&#038;h=264" alt="" width="300" height="264" /></a></p>
<p>More bread leads to more eating leads to more labour leads to more baking leads to more bread &#8211; as long as there&#8217;s a big enough supply of fuel, water, flour, oxygen, etc.</p>
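<p>To make the bookkeeping concrete, here is a rough Python sketch of the combined baking/eating process. Every coefficient in it is invented purely for illustration (I don&#8217;t know the real numbers), and the names <code>BAKING</code>, <code>EATING</code>, <code>net_change</code> and <code>autocatalytic_species</code> are hypothetical. Each process is written as a pair of dictionaries, inputs and outputs, per loaf of bread.</p>

```python
# Hypothetical stoichiometric coefficients per loaf; all numbers are made up.
BAKING = (
    {"flour": 1.0, "water": 0.5, "fuel": 0.2, "labour": 1.0,
     "oxygen": 0.3, "yeast": 1.0},
    {"bread": 1.0, "co2": 0.3, "yeast": 1.1},   # slightly more yeast comes out
)
EATING = (
    {"bread": 1.0, "other food": 0.5, "oxygen": 0.4},
    {"labour": 3.0, "co2": 0.4, "sewage": 0.5},
)


def net_change(processes):
    """Sum outputs minus inputs over all processes, for each kind of stuff."""
    net = {}
    for inputs, outputs in processes:
        for stuff, amount in inputs.items():
            net[stuff] = net.get(stuff, 0.0) - amount
        for stuff, amount in outputs.items():
            net[stuff] = net.get(stuff, 0.0) + amount
    return net


def autocatalytic_species(processes):
    """Stuff that is used as an input somewhere but is produced on balance."""
    net = net_change(processes)
    used = set()
    for inputs, _ in processes:
        used.update(inputs)
    return sorted(stuff for stuff in used if net[stuff] > 1e-12)


net = net_change([BAKING, EATING])
for stuff in sorted(net):
    print(f"{stuff:>10}: {net[stuff]:+.2f}")
print("autocatalytic:", autocatalytic_species([BAKING, EATING]))
```

<p>With these invented numbers the bread cancels out, labour and yeast come out ahead, and everything else is either used up or given off, which is the pattern the highlighted cycle describes.</p>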
<p>Here&#8217;s the big point I want to make in this post: the way the economy grows is through this type of autocatalytic cycle.  Economic growth is all about the physical self-reproduction of physical stuff.  It&#8217;s about more stuff leading to more stuff leading to more stuff, and when we say something contributes to the economy, we really mean it plays a role in this self-reproduction of stuff.  All this might sound terribly materialistic, but it&#8217;s not meant to be &#8211; the goal here is to understand economic growth, but I&#8217;m not saying it&#8217;s always necessarily a good thing.  Things like a high quality of living, high employment and a healthy environment are important, and in an ideal world the economy would be organised to maximise these things instead of, or as well as, growth.</p>
<p>At this point you might be thinking I&#8217;ve neglected something that&#8217;s usually considered very important when it comes to the study of economics: money.  You&#8217;d be right, but I think one of the biggest mistakes you can make when trying to understand economics is to think it&#8217;s all about money.  Money is an important tool by which the physical processes of the &#8220;real&#8221; economy are organised, but if we lived in a money-less command economy the processes in the diagram above would still happen, they&#8217;d just happen for different reasons.  They also happen if you bake your own bread at home. In that case no money changes hands, but the process still produces bread, still produces more capacity for labour, and thus still contributes to the economy.  Although money is terribly important in the general scheme of things, and it&#8217;s certainly important to understand it, I&#8217;ve chosen to try and understand the physical basis of the real economy first.</p>
<p>So anyway, let&#8217;s expand out even further from the diagrams above, and try to see where some of those processes&#8217; other inputs come from.</p>
<p><a href="http://jellymatter.files.wordpress.com/2011/12/screen-shot-2011-12-27-at-18-43-27.png"><img class="alignnone size-medium wp-image-2601" title="Screen shot 2011-12-27 at 18.43.27" src="http://jellymatter.files.wordpress.com/2011/12/screen-shot-2011-12-27-at-18-43-27.png?w=300&#038;h=227" alt="" width="300" height="227" /></a></p>
<p>I&#8217;ve included farming and the mining of fossil fuels as well as the economically hugely important production of nitrogen fertiliser through an industrial process called the <a href="http://en.wikipedia.org/wiki/Haber_process">Haber-Bosch process</a>, which uses up quite a lot of energy. (Mind-blowing stat: globally, we fix more nitrogen per year through this industrial process than the entire biosphere put together.)</p>
<p>Ultimately everything on this diagram is powered by two sources: sunlight and buried fossil fuels.  One of these will last for a few more billion years but the other won&#8217;t, so this picture clearly represents a very temporary state of affairs. (The availability of fresh water is also ultimately due to the sun via the water cycle, but I haven&#8217;t put that on the diagram. You have to stop somewhere.)</p>
<p>A lot of the power from sunlight enters the system indirectly through the production of oxygen by natural ecosystems. One of the things I like about this approach is how quickly you have to deal with the fact that the economy can&#8217;t be seen as a separate system from the natural environment.</p>
<p>This diagram is a massive over-simplification of the real picture, and I&#8217;m sure I&#8217;ve left off some important arrows, but it&#8217;s already become very complicated.  Studying it for a while reveals that it&#8217;s absolutely chock full of autocatalytic cycles.  Most of them involve the food-human interaction.  More farming means more grain means more bread means more humans means more farming.  Or more fossil fuel burning means more <img src='http://s0.wp.com/latex.php?latex=%5Ctext%7BCO%7D_2&amp;bg=ffffff&amp;fg=333333&amp;s=0' alt='&#92;text{CO}_2' title='&#92;text{CO}_2' class='latex' /> in the atmosphere means higher crop productivity means more humans means more fossil fuel mining means more fossil fuel gets burnt. (As you might expect from the diagram, carbon dioxide acts as a fertiliser.)</p>
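<p>Finding such cycles by eye gets tedious, so here is a small sketch of how one might search for them. The graph below contains only the &#8220;more X means more Y&#8221; links spelled out in the previous paragraph, not the full diagram, and the function name <code>simple_cycles</code> is made up. A closed loop found this way is only a candidate: whether it is really autocatalytic depends on the coefficients, which are not modelled here.</p>

```python
# "More X means more Y" links taken from the two chains in the text only.
GRAPH = {
    "farming": ["grain"],
    "grain": ["bread"],
    "bread": ["humans"],
    "humans": ["farming", "fuel mining"],
    "fuel mining": ["fuel burning"],
    "fuel burning": ["CO2"],
    "CO2": ["crop productivity"],
    "crop productivity": ["humans"],
}


def simple_cycles(graph):
    """List each simple cycle once, rooted at its smallest node name."""
    cycles = []
    for start in sorted(graph):
        stack = [(start, [start])]
        while stack:
            node, path = stack.pop()
            for nxt in graph.get(node, []):
                if nxt == start:
                    cycles.append(path)          # closed a loop back to start
                elif nxt > start and nxt not in path:
                    # Only visit larger names, so each cycle is found once.
                    stack.append((nxt, path + [nxt]))
    return cycles


for cycle in simple_cycles(GRAPH):
    print(" -> ".join(cycle + [cycle[0]]))
```

<p>On this cut-down graph the search returns the farming loop and the fossil fuel loop; on the full diagram there would be many more.</p>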
<p>There&#8217;s also at least one autocatalytic cycle that doesn&#8217;t involve the creation of more humans: oil mining uses fuel but produces more fuel than it uses up (otherwise it wouldn&#8217;t be worthwhile). In general the growth of a business is autocatalytic, as well as the growth of the economy as a whole.  A business invests in capital, which then (hopefully) produces a return larger than the investment, allowing the purchase or construction of more capital.  It&#8217;s interesting to think about what these cycles look like for different types of business, but I&#8217;ll leave that for another time.</p>
<p>For now I&#8217;ve got to the point I wanted to get to with this post, where I&#8217;ve shown that the economy can be seen as a huge network of interrelated physical processes, some of which can be thought of as being like chemical reactions and others of which (like the Haber-Bosch process) actually are chemical reactions. I&#8217;ll leave you with the thought that the other place where you&#8217;re likely to find big, complicated diagrams mapping interlinked networks of chemical processes is molecular biology, where these diagrams map out the metabolism of an organism. It&#8217;s an interesting comparison to think about.</p>
]]></content:encoded>
			<wfw:commentRss>http://jellymatter.com/2012/08/01/the-chemistry-of-economics/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/3d3d01f62eb81589b829f28c8c9f6cd0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Nathaniel</media:title>
		</media:content>

		<media:content url="http://jellymatter.files.wordpress.com/2011/12/screen-shot-2011-12-27-at-18-22-32.png?w=300" medium="image">
			<media:title type="html">Screen shot 2011-12-27 at 18.22.32</media:title>
		</media:content>

		<media:content url="http://jellymatter.files.wordpress.com/2011/12/screen-shot-2011-12-27-at-18-43-03.png?w=300" medium="image">
			<media:title type="html">Screen shot 2011-12-27 at 18.43.03</media:title>
		</media:content>

		<media:content url="http://jellymatter.files.wordpress.com/2011/12/screen-shot-2011-12-30-at-12-58-42.png?w=300" medium="image">
			<media:title type="html">Screen shot 2011-12-30 at 12.58.42</media:title>
		</media:content>

		<media:content url="http://jellymatter.files.wordpress.com/2011/12/screen-shot-2011-12-27-at-18-43-27.png?w=300" medium="image">
			<media:title type="html">Screen shot 2011-12-27 at 18.43.27</media:title>
		</media:content>
	</item>
		<item>
		<title>Can plants learn?</title>
		<link>http://jellymatter.com/2012/07/30/can-plants-learn/</link>
		<comments>http://jellymatter.com/2012/07/30/can-plants-learn/#comments</comments>
		<pubDate>Mon, 30 Jul 2012 10:00:03 +0000</pubDate>
		<dc:creator>Nathaniel Virgo</dc:creator>
				<category><![CDATA[Opinion]]></category>
		<category><![CDATA[experiments]]></category>
		<category><![CDATA[leaning]]></category>
		<category><![CDATA[plant biology]]></category>
		<category><![CDATA[plants]]></category>
		<category><![CDATA[questions]]></category>

		<guid isPermaLink="false">http://jellymatter.com/?p=2944</guid>
		<description><![CDATA[This post is about an idea I&#8217;ve had for a long time, about an experiment to test whether plants can learn. I&#8217;m very far from being a plant biologist, so I&#8217;m unlikely to ever be in a position to do this experiment, but it&#8217;s an interesting thing to think about. Learning isn&#8217;t something we normally [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=jellymatter.com&#038;blog=19982511&#038;post=2944&#038;subd=jellymatter&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>This post is about an idea I&#8217;ve had for a long time, about an experiment to test whether plants can learn. I&#8217;m very far from being a plant biologist, so I&#8217;m unlikely to ever be in a position to do this experiment, but it&#8217;s an interesting thing to think about.</p>
<p><span id="more-2944"></span>Learning isn&#8217;t something we normally associate with plants, of course. We don&#8217;t normally think of plants as behaving at all &#8211; but of course they do, it&#8217;s just that they generally do it much more slowly than animals. Anyone who&#8217;s ever pitched a tent for a few days at a time will have noticed that the grass underneath grows tall and thin and pale. It does this in an attempt to find some light. If the tent were instead a small rock or fallen tree then the grass would be able to grow its way out from underneath and get a source of energy again.</p>
<p>But is this learning? No &#8211; it&#8217;s an evolutionarily in-built response, equivalent to what in animals we could call an instinctive or reflex response. You could say it&#8217;s a kind of learning that&#8217;s taken place on an evolutionary time scale (some of the grass&#8217;s ancestors grew in this way and survived, so grass as a whole &#8220;learnt&#8221; to do it), but we&#8217;re interested in learning on the time scale of an individual.</p>
<p>So what makes a behaviour &#8220;learning&#8221;? There are probably a lot of different definitions depending on exactly what question you want to ask, but the one I&#8217;m interested in here is classical, or Pavlovian, conditioning. Pavlov noticed that dogs salivate in preparation for eating, and discovered that if he rang a bell every time he fed them, they would salivate whenever he rang the bell. The point is that the bell is otherwise an arbitrary thing &#8211; it doesn&#8217;t cause salivation unless the dog has learned to associate it with food.</p>
<p>So could we do the same thing with plants? It seems to me that it wouldn&#8217;t be too hard. Plants don&#8217;t salivate, but they do have their own set of &#8220;reflex&#8221; responses, including the response to low light that I mentioned above. Another is the response to grazing by animals, which usually involves growing tougher leaves, and perhaps producing some bitter-tasting or toxic compounds in order to deter further grazing.</p>
<p>The basic idea of the experiment would be something like this: we grow a whole load of plants at the same time, under identical conditions. These plants are divided into a test group and a control group. For the test group, at some randomly determined time we expose them to some arbitrary stimulus. This has to be something the plant is able to detect, but it shouldn&#8217;t be something that normally produces a strong response &#8211; I guess it could be something like a small change in soil pH, or blocking out part of the light for a while. In classical conditioning terms, this is called the &#8220;conditioned stimulus&#8221; (CS).  Then, some time later (perhaps the next day) we apply a different stimulus, one that the plant will respond to (but which will not kill it). For example, we could block out all the light for a couple of days, or we could clip the leaves to simulate grazing. This is called the &#8220;unconditioned stimulus&#8221; (US). Some time later we do this again &#8211; apply the conditioned stimulus, then apply the unconditioned stimulus. This is repeated several times. Finally, we apply the conditioned stimulus but <em>not</em> the unconditioned one and observe the results.</p>
<p>For the control group we apply both stimuli, but we apply <em>both</em> stimuli at randomly determined times. So the plants have been exposed to the same conditions overall, except that there&#8217;s no point in learning an association between the conditioned stimulus and the unconditioned stimulus, because they&#8217;re uncorrelated with one another.</p>
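<p>To make the two protocols concrete, here is a minimal sketch of how the stimulus schedules might be generated. Everything specific in it is an assumption made for illustration: the function name <code>make_schedule</code>, the five pairings, the 60-day horizon and the one-day gap between CS and US are not part of the proposal above, and the sketch ignores practical details such as keeping trials well separated.</p>

```python
import random


def make_schedule(group, n_trials=5, horizon_days=60, cs_us_gap_days=1, seed=0):
    """Return a sorted list of (day, stimulus) events for one group of plants.

    group == "test":    every CS is followed by a US a fixed gap later.
    group == "control": CS days and US days are drawn independently,
                        so the two stimuli are uncorrelated.
    All default values are illustrative assumptions, not recommendations.
    """
    rng = random.Random(seed)
    candidate_days = range(horizon_days - cs_us_gap_days)

    # The conditioned stimulus is applied on randomly chosen days in both groups.
    cs_days = sorted(rng.sample(candidate_days, n_trials))

    if group == "test":
        # Paired protocol: the US always follows the CS.
        us_days = [day + cs_us_gap_days for day in cs_days]
    elif group == "control":
        # Unpaired protocol: same number of US events, independent timing.
        us_days = sorted(rng.sample(candidate_days, n_trials))
    else:
        raise ValueError("group must be 'test' or 'control'")

    events = [(day, "CS") for day in cs_days] + [(day, "US") for day in us_days]
    # Final probe: the CS on its own, after which the response is observed.
    events.append((horizon_days, "CS probe"))
    return sorted(events)


if __name__ == "__main__":
    for group in ("test", "control"):
        print(group, make_schedule(group))
```

<p>Both groups receive the same number of each stimulus and the same final probe, so any difference in the response to the probe can only come from the pairing.</p>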
<p>A successful result from this experiment would be that the test group responds to the conditioned stimulus as if it were the unconditioned one. So we change the pH slightly and the plants start to grow tall and thin, or we block out part of the light and they produce anti-grazing toxins. But crucially, it&#8217;s not a successful result unless the control group <em>doesn&#8217;t</em> react in this way. This shows us that the response to the CS is a learned one, and not just something that all plants of that species would do anyway.</p>
<p>Of course there are a lot of subtleties involved in running such an experiment, most of which I&#8217;m probably not aware of. An unsuccessful result wouldn&#8217;t be very interesting &#8211; it would only show that we were using stimuli that that species of plant can&#8217;t respond to. If plants can learn, some are probably better at it than others. The experiment would be easier to perform with a fast-growing weed (I was considering trying it myself with watercress at one point), but I suspect that trees are probably more likely to be smart, since they have more invested in staying alive for a long time.</p>
<p>This would be a very interesting experiment to try. Although it might seem unlikely to be successful, it probably isn&#8217;t all that hard to carry out, on the general scale of plant experiments. If any experimental plant biologist ever reads this who would like to give it a try, please feel free to get in touch!</p>
]]></content:encoded>
			<wfw:commentRss>http://jellymatter.com/2012/07/30/can-plants-learn/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/3d3d01f62eb81589b829f28c8c9f6cd0?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">Nathaniel</media:title>
		</media:content>
	</item>
		<item>
		<title>Ironic science, pragmatism, and the &#8220;is best viewed as&#8221; argument</title>
		<link>http://jellymatter.com/2012/07/09/ironic-science-pragmatism-and-the-is-best-viewed-as-argument/</link>
		<comments>http://jellymatter.com/2012/07/09/ironic-science-pragmatism-and-the-is-best-viewed-as-argument/#comments</comments>
		<pubDate>Mon, 09 Jul 2012 14:22:00 +0000</pubDate>
		<dc:creator>James Thorniley</dc:creator>
				<category><![CDATA[Opinion]]></category>
		<category><![CDATA[Reviews]]></category>
		<category><![CDATA[Chemero]]></category>
		<category><![CDATA[cognition]]></category>
		<category><![CDATA[computationalism]]></category>
		<category><![CDATA[dynamical systems]]></category>
		<category><![CDATA[Einstein]]></category>
		<category><![CDATA[Haken-Kelso model]]></category>
		<category><![CDATA[Horgan]]></category>
		<category><![CDATA[Irony]]></category>
		<category><![CDATA[postmodernism]]></category>
		<category><![CDATA[radical embodied cognitive science]]></category>
		<category><![CDATA[Relativity]]></category>
		<category><![CDATA[String theory]]></category>
		<category><![CDATA[The End of Science]]></category>
		<category><![CDATA[Witten]]></category>

		<guid isPermaLink="false">http://jellymatter.com/?p=2911</guid>
		<description><![CDATA[I&#8217;ve read a couple of interesting books recently, one was &#8220;The End of Science&#8221; by John Horgan, and the other was &#8220;Radical Embodied Cognitive Science&#8221; by Anthony Chemero. Horgan&#8217;s theme was the question of whether the fundamentals of science are now so solid that before long nothing genuinely &#8220;new&#8221; will be left to find, and [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=jellymatter.com&#038;blog=19982511&#038;post=2911&#038;subd=jellymatter&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>I&#8217;ve read a couple of interesting books recently: one was &#8220;The End of Science&#8221; by John Horgan, and the other was &#8220;Radical Embodied Cognitive Science&#8221; by Anthony Chemero. Horgan&#8217;s theme was the question of whether the fundamentals of science are now so solid that before long nothing genuinely &#8220;new&#8221; will be left to find, and science will be reduced to either obsolescence, or puzzle-solving type application of existing theories to particular problems. The only other type of science that still exists, according to Horgan, is &#8220;ironic&#8221; science: a kind of semi-postmodern project to explain or describe what we already know in more &#8220;beautiful&#8221; or appealing forms, but which never produces hypotheses that are empirically testable and, for this reason, doesn&#8217;t <em>actually</em> advance knowledge. Horgan is distinctly dismissive of this kind of science, as being not &#8220;proper&#8221; science (he deliberately compares it to postmodern literary criticism, which he seems to have particular contempt for, having once been a student of it himself). Chemero would be, I&#8217;m sure, classified by Horgan as an ironic scientist. I don&#8217;t think Chemero would be able to deny that in a sense, his philosophy is empirically untestable, but he certainly argues that it <em>is</em> pragmatic in the sense of being useful to scientists engaged in solving real world problems.</p>
<p><span id="more-2911"></span></p>
<p>Horgan seems to be particularly infuriated by string theorist Edward Witten. String theory, by the sounds of things, is the ultimate ironic science. The purpose of string theory is to unify Einstein&#8217;s general relativity with quantum mechanics, which it does by positing the existence of new underlying physical laws. The new laws only really exist in the mathematical world, and don&#8217;t add anything to quantum mechanics and relativity that empirical scientists could ever hope to find. Since most physicists are quite happy to use the quantum mechanical and/or relativistic foundations with no obvious loss of modelling capability, string theory seems to be in a sense pointless. The likes of Witten are only intrigued by it because they are actually mathematicians more than physicists, and in the mathematical domain it does actually represent a significant body of new findings. (Incidentally, I know nothing about string theory, I&#8217;m just summarising Horgan&#8217;s argument).</p>
<p>I think the &#8220;ironic&#8221; label can easily be applied to authors like Chemero (and there are many others who make similar arguments). Chemero&#8217;s position centres around an apparent dichotomy between viewing cognitive systems as computers and viewing them as &#8220;dynamical systems&#8221; (the latter being the &#8220;radical embodied&#8221; stance). From the computationalist viewpoint, our brains are like computers that act on symbols that &#8220;represent&#8221; meaningful properties of the world we live in &#8211; the states of neurons encode such representations, and the responses from other neurons perform manipulations, computations, and eventually encode an output state (such as motor activation). In the radical embodied paradigm, our brains are simply part of a larger physical system comprised of the brain itself, its body, and the environment it finds itself in. In much the same way as a tennis ball traces a trajectory in space as it is hit by rackets and falls at some point to earth, our neurons (as a collective) are simply tracing out a trajectory through the space of all their possible activation states in accordance with physical laws (obviously, brains obey much more complex trajectories than something passive like a ball, but the modelling approach is essentially the same).</p>
<p>It&#8217;s worth mentioning that there is in fact a &#8220;middle ground&#8221; which Chemero calls (non-radical) embodied cognition, which considers the role of the body in cognition as important but does not absolutely refuse to accept the existence of any type of computational symbol manipulation.</p>
<p>A problem for people like Chemero is that they are often interpreted as trying to argue that &#8220;the brain <em>is</em> a dynamical system&#8221;. Such a statement would be trivial almost to the point of absurdity &#8211; a dynamical system is really just a model, and the dynamical systems modelling approach can be applied to all sorts of physical systems. Stating that the brain &#8220;is&#8221; a dynamical system is nothing more than saying that the brain is made of physical-stuff, a viewpoint with which arch computationalists would surely agree.</p>
<p>To really drive the point home, not only is it trivial that a brain is a dynamical system, it is also trivially true that any computer (such as the one you are reading this on) is a dynamical system, if that is how you choose to view it.</p>
<p>I would also argue that it&#8217;s pretty much just as valid for any dynamical system to be viewed as a computer, if that&#8217;s what you want to do. So in exactly the same way, the statement &#8220;the brain <em>is</em> a computer&#8221; is again true, but trivially so.</p>
<p>Since both statements are obviously true, debating which one is more correct seems like a waste of time, and hence a candidate for the &#8220;ironic&#8221; science label. However, the real crux of Chemero&#8217;s argument is not that the brain &#8220;is&#8221; a dynamical system, but that it is &#8220;best viewed as&#8221; a dynamical system. A simple experiment to do is to hold out your arms so that your forearms are pointed upwards, bent at the elbow, then wave your forearms side to side as if they were windscreen wipers (both left at the same time then both right at the same time). Try and do this faster and faster and you should find that it becomes increasingly difficult to keep your arms in sync in the windscreen wiper-way &#8211; in fact, they are likely to sync up the opposite way, with both pointing inwards together then pointing outwards together. This phenomenon has been experimentally observed in a lot of similar scenarios and explained using a <a href="http://www.scholarpedia.org/article/Haken-Kelso-Bunz_model">dynamical systems model</a>. Chemero&#8217;s point is that when you think of the behaviour as the result of a dynamical system, it&#8217;s actually quite easy to model it and thus gain some understanding of how your brain works. If you wanted to take a computational approach, where you insist that the brain must be working on some internal representation of the movement of your arms, it actually seems quite convoluted and unnecessarily difficult to describe this phenomenon in any meaningful detail.</p>
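<p>To give a flavour of how little is needed, here is a rough sketch of the relative-phase equation from the Haken-Kelso-Bunz model, dphi/dt = -a sin(phi) - 2b sin(2 phi), integrated with a simple Euler step. In my reading of the model, phi = pi corresponds to the windscreen-wiper pattern and phi = 0 to the mirror-symmetric one, and speeding up the movement corresponds to lowering the ratio b/a. The function name, step size, starting phase and the particular ratios tried are all assumptions chosen for illustration.</p>

```python
import math


def hkb_final_phase(b_over_a, phi0=math.pi - 0.1, a=1.0, dt=0.01, steps=20000):
    """Euler-integrate dphi/dt = -a*sin(phi) - 2*b*sin(2*phi).

    phi is the relative phase between the two arms. The default start is
    just off the wiper-like pattern (phi = pi). Returns the phase the
    system has settled to after steps * dt time units.
    """
    b = b_over_a * a
    phi = phi0
    for _ in range(steps):
        phi += dt * (-a * math.sin(phi) - 2.0 * b * math.sin(2.0 * phi))
    return phi


if __name__ == "__main__":
    # Slow movement (large b/a): the wiper pattern should be stable.
    print("b/a = 1.0:", hkb_final_phase(1.0))
    # Fast movement (small b/a): the phase should fall into the symmetric pattern.
    print("b/a = 0.1:", hkb_final_phase(0.1))
```

<p>Linearising around phi = pi gives a growth rate of a - 4b, so the wiper pattern loses stability once b/a drops below 1/4, while phi = 0 stays stable throughout. That one-line equation is the whole model of the switch.</p>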
<p>This is an &#8220;is best viewed as&#8221; argument, and it&#8217;s inherently pragmatic &#8211; the whole point is that thinking of the brain in a different way is supposed to make real, empirical science easier to do. However, the &#8220;is best viewed as&#8221; argument is still ironic in the sense that it, <em>itself</em>, is completely untestable. Clearly whether something &#8220;is best viewed as&#8221; can&#8217;t be measured using scientific instruments, it&#8217;s a value judgement as to what is the most productive approach for scientists.</p>
<p>So really, what I wanted to say, is that &#8220;ironic&#8221; science is not really as wasteful as Horgan makes out. Perhaps string theory goes to extremes in terms of mathematical obfuscation and is really too separated from reality. However, science that involves making an &#8220;is best viewed as&#8221; argument does, at least potentially, have a lot of value. I also think that (perhaps) you could even say that the things that Horgan clearly does view as genuine scientific contributions are actually themselves ironic &#8220;is best viewed as&#8221; arguments. General relativity, for example, views gravity as a property of a four dimensional &#8220;spacetime&#8221;, not because the existence of spacetime can be empirically tested, but because doing so generates models of physical systems which can be tested and understood (at least by people sufficiently familiar with the theory), and thus it is pragmatically useful.</p>
]]></content:encoded>
			<wfw:commentRss>http://jellymatter.com/2012/07/09/ironic-science-pragmatism-and-the-is-best-viewed-as-argument/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/8c84fe8613248465600ad95e94393d40?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">jamesthorniley</media:title>
		</media:content>
	</item>
	</channel>
</rss>
