Mrge Some Script

Published 2023 August 16, 01:43

This is a script for a Summer of Math Exposition event by 3 blue 1 brown channel.

The video describes the principles behind MRGE randomness extractor

Intro

01.png

Hello, adventurer. Today I want to share one peculiar mathematical construction that is used to extract randomness and map weighted trees

02.png

But before we proceed let’s first talk about what IS that randomness extraction all about

What is RE

First, let’s talk about a typical random number generator design.

1-1.png

It usually starts with a physical process that we cannot easily predict. For example let’s consider a radioactive source that emits …

…an electron. The presence of emitted electron can be detected with say a Geiger[gaige] counter which produces …

…spikes of current whenever an electron passes. With enough data from measuring time passed between the consecutive events we can …

..plot its distribution which will show that some time differences are more frequent than others. Since not all readings have equal probabilities the readouts cannot be used as a reliable random number generator by itself. And that’s when today’s hero appears:

the process of randomness extraction is mapping output of randomness source to evenly distributed numbers.

It is synonimous to say that we extract randomness from an entropy source to produce a stream of random numbers.

The most common use-case for random numbers is..

… cryptography which is widely used in our everyday life in the Internet. There is much interesting in each of the topics, but today we will …

…focus on the conversion between two distributions. Shall we check some examples first?

Let’s imagine a binary source which produces zeroes more frequently than ones. The most famous algorithm is

the Von-Neumann[van-noymen] extractor. The idea is to split source stream into pairs and map to the output according to the following rule: if the pair consists of two equal characters, then produce no output and proceed to the next pair. Otherwise produce zero or one depending on the order of symbols within the pair.

Another approach is to use special functions called…

.. hash-functions. Hash functions are typically defined in a way that a slightest change in the input basically reshuffles the output in a given range. This property is very handy for a purpose of equalising distribution of ones and zeroes in the extractor’s output.

And for another example let’s consider…

.. mixing two strings produced by independent sources into one. The simplest implementation of this extractor is to use a single-use pre-generated random numbers as the second source and the xor operation as a mixer. Xor will produce zero if it is fed two equal symbols and one if the input symbols are different.

But why bother inventing something else when we have some other else that is already working?

Although we are talking abut the three particular extractors the criterions and drawbacks are mostly the same among other. First, let’s look at hash function. Consider the one that maps numbers to an english dictionary. If there are only two possible inputs that we are able to get from the source then the function will produce an output stream of similar chunks of symbols

The same problem persists in the case of highly uneven distribution of input source.

Let’s analyze the Von-Neumann extractor. In the given example we can see that the long string of input is shortened to a much smaller one. Actually with a true random input stream of ones and zeroes the Von Neumann will produce about one fourth the size of the input sequence.

As for the multi-source approach its simplicity is balanced by a requirement to have not one but at least …

..two independent sources of entropy.

Can we do better? We surely can try to. But let’s focus on another problem that is closely related first

Let’s imagine you have designed a battlebot that has a potential to change the meta of combat robot sports. And you are looking for an …

..investor to your project. Investors all conveniently live in a single neighbourhood in four-storey houses. Chances to get funding are small but there are plenty of houses to try. However the investors are ..

4-2a

..communicating with each other. So if you fail to convince one, then the other investors in the house will deny your proposal too. And if they spot a pattern in your requests addressing then you will be banned in the whole neighbourhood. And they don’t tolerate anyone who ..

4-2b

..uses electronic devices too

You are driving to the meeting in your trusty Ben Des-Merci and trying to think of a good strategy. You have heard that those trying to get sequences from their heads have failed the test around house twenty. Your aim is to talk to at least one hundred of them. While pondering on the problem you notice there are a lot of …

..flies flying around. And what catches your attention specifically is that some of them pass through the hood emblem.

By arranging the holes ..

..you can get source of events that have equal probability among three possibilities.

How can one use this power of nature in his quest for perfection?

Let’s try to visualise the problem first.

Here’s a fly, there’re the holes. Each event results in one of the three equally probable outcomes.

A series of such events can be represented by a tree structure

We start with a desire and first event splits our path into three. The second event combined with the first one will produce a total of nine equally probable outcomes. The third fly will further triple the number resulting in twenty-seven unique branches.

Can we map this structure to a mere four-storey house?

Surely so. By grouping the outcomes into bunches of six we can map the fly branches into four house branches of a second tree.

There is also a little chance to being forced to get more data because we got into the “excess” fly branch. The one that cannot be mapped to a floor while preserving equal probabilities.

Can we do better? Indeed,…

.. we can. By building the second layer of our house tree we become able to map twice as many random floors with the same amount of random flies.

But can we bend the idea if the input events are not equally probable?

Unfortunately this elegant structure breaks when events have unequal probabilities.

for example in this case there is outcome red that occurs on average in the half of all events. While the other half is split equally between outcomes two, three, four and five.

The obvious problem with direct mapping is that not all of the outputs have the same probability

Is it the part when things become interesting? Buckle up

The key in working with weighted tree is to map it on auxiliary structure first

This structure is known to anyone who attempted to learn geometry and is called an interval

Shall we try it now?

Suppose there is that lovely half and eighths tree as an input source

We draw an interval that represents all the outcomes of our two-events process

The interval is of length one and is divided in a way to help us map the …

…probabilities of each outcome. On the first step we consider the..

..division that corresponds to the first layer of the tree.

Suppose the first event resulted in choosing the ..

..red node. This puts us in the first half of the interval that corresponds to probability one half with shift of zero.

If we were given this interval and the structure of the event tree we would be able to unambiguously identify the input

Since the second event is independent we can apply the same logic to it while picking our current OUTPUT interval …

.. as the full scale for mapping all possible outcomes of the ..

..SECOND event.

In this case it’s event five that shrinks the mapped output interval according to the size and order of the event interval.

Once again, specifying only the two ends of the interval may result in full reconstruction of input sequence provided we know the structure and weights of the tree

The same procedure can be ..

.. repeated recursively with new inputs further shrinking down the output interval

But the interval is quite feeble in itself. Can we do anything with it from here?

9a-1

And indeed we can. Let’s start with the interval and reverse the procedure in order to map it onto a different tree. Say, the one with ..

9a-2

five equally probable branches.

At the first step we see that this interval could be produced by a given tree provided the first event was ..

9a-3

..red, or number three in this case.

At the next step we ..

9a-4

..repeat the procedure and check if the interval could be produced by the second layer of this tree. Which in this case is a possibility since the input interval lies completely within the bounds of a branch three two mapped onto the interval zero to one

At the next step..

9a-5

however, we see that two branches are mapped onto the input interval, hence we cannot unabiguously identify a single one.

To solidify the procedure let’s see how it works in the tree of size three…

9a-6

… which results in a third layer branch with address 2 2 1 that can be recovered from the given input interval.

If at this point we put a binary tree as an output then we will be able to map essentially any distribution into a binary number that is evenly distributed due to the structure of the interval mapping

9a-7

This means you will be able to recover pretty much all the iformation from the source. Can we use this extractor to extract toys in the arcade?

11-1

Not directly, but let’s pay attention to the following: this extractor does not require numeric input. It will work with anyting that has probabilities that could be mapped onto an interval.

Suppose you are in arcade where you are given prizes at random: a toy forge, toy hammer, toy axe, toy boat or a puzzle box. And you struggle in your life with choosing a…

11-2

.. car manufacturer to sell a soul to. So you decide to map…

11-3

.. car preferences and play arcade until it is decided. The first trophy …

11-4

.. narrows down a list.

11-5

And the second one has ..

11-6

.. enough information to map the outcome. No numbers needed at all

11-7

But what if I don’t know the distribution beforehand?

Although most of the randomness extractors require knowledge of a randomness source, it is possible to surpass this limitation with our extractor.

10-1

Check this weird tree. It starts with one event at the top. And after we register the second event there are two possible outcomes: it is either the same or another event.

10-2

The third event depending on the history could be the same first orange event for the third time, or a second event, or even a third event in case we are in a branch that already has registered the second one.

10-3

The leftmose branches of this tree are trivial zero-information cases when we witness the same input each time

The rightmost branch is actually also a zero-information case since we don’t have an estimate for probability of unregistered outcomes

While ..

10-5

..inner branches all correspond to a certain non-trivial distribution that could be used to extract randomness.

10-6

And so on.

With this logic we can build a tree that structurises our knowledge of a source distribution based on the statistics that the source has provided.

10-7

So what have we learned about this lovely randomness extractor?

12-1

Using double arithmetic encoding for randomness extraction results in a peculiar mathematical structure that processes all information from the randomness source while not requiring any prior information about it. Of course this is not the ultimate..

12-2

extractor that will solve all the randomness extraction needs but it is nice at least.

final

And that concludes today’s topic of randomness extraction

final2

Thanks for watching and thanks for multiplying a number of people who is aware of this mostly dust-covered topic.

final3

Subscribe for a #RANDOMNESS EXTRACTION and don’t forget to invent your own randomness extractor for your custom os kernel

Aso if you want to learn more quantitatively about this tpic you could check two lectures on my channel that cover in-depth

TODO: Von-Neumann to Von Neumann at 2 and 3

?? Change toys to cars <> no apriori ??