Forecasting the Growth of Complexity and Change

 

 

THEODORE MODIS1

Technological Forecasting & Social Change, 69, No 4, 2002 

 

ABSTRACT

 

     In the spirit of punctuated equilibrium, complexity is quantified relatively in terms of the spacing between equally important evolutionary turning points (milestones). Thirteen data sets of such milestones, obtained from a variety of scientific sources, provide data on the most important complexity jumps between the big bang and today. Forecasts for future complexity jumps are obtained via exponential and logistic fits on the data. The quality of the fits and common sense dictate that the forecast by the logistic function should be retained. This forecast stipulates that we have already reached the maximum rate of growth for complexity, and that in the future complexity's rate of change (and the rate of change in our lives) will be declining. One corollary is that we are roughly halfway through the lifetime of the Universe. Another result is that complexity's rate of growth has built up to its present high level via seven evolutionary sub processes, themselves amenable to logistic description.

 

 

1. Introduction

Change has always been an integral feature of life. "You cannot step twice in the same river", said Heraclitus—who has been characterized as the first Western thinker—illustrating the reality of permanent change. Heraclitus invoked an incontrovertible law of nature according to which everything is mutable, “all is flux.” In the physics tradition such laws are called universal laws, for example, the second law of thermodynamics, which stipulates that entropy always increases, and explains such things as why there can be no frictionless motion. In fact, there are theories that link the accumulation of complexity to the dissipation of entropy, or wasted heat.

     The accelerating amount of change in technology, medicine, information exchange, and other social aspects of our life, is familiar to everyone. Progress—questionably linked to technological achievements—has been following progressively increasing growth rates. The exponential character of the growth pattern of change is not new. Whereas significant developments for mankind crowd together in recent history, they populate sparsely the immense stretches of time in the earlier world. The marvels we witnessed during the 20th century surpass what happened during the previous one thousand years, which in turn is more significant than what took place during the many thousands of years that humans lived in hunting-gathering societies. What is new is that we are now reaching a point of impasse, where change is becoming too rapid for us to follow. The amount of change we are presently confronted with is approaching the limit of the untenable. Many of us find it increasingly difficult to cope effectively with an environment that changes too rapidly.

     What will happen if change continues at an accelerating rate? Is there a precise mathematical law that governs the evolution of change and complexity in the Universe? And if there is one, how universal is it? How long has it been in effect and how far in the future can we forecast it? If this law follows a simple exponential pattern, we are heading for an imminent singularity, namely the absurd situation where change appears faster than we can become aware of it. If the law is more of a natural-growth process (logistic pattern), then we cannot be very far from its inflection point, the maximum rate of change possible.

 

2. The Task

     Change is linked to complexity. Complexity increases both when the rate of change increases and when the number of things that are changing around us increases. Our task then becomes to quantify complexity, as it evolved over time, in an objective, scientific, and therefore defensible way, to determine the law that best describes complexity's evolution over time, and then to forecast its future trajectory. This will throw light onto what one may reasonably expect as the future rate at which change will appear in society.

However, quantifying complexity is something easier said than done.

 

Complexity

We have seen much literature and extensive preoccupation of "hard" and "less hard" scientists with the subject of complexity. Yet we have neither a satisfactory definition for it, nor a practical way to measure it. The term complexity remains today vague and unscientific. In his best-selling book Out of Control Kevin Kelly concludes:[1]

 

How do we know one thing or process is more complex than another? Is a cucumber more complex than a Cadillac? Is a meadow more complex than a mammal brain? Is a zebra more complex than a national economy? I am aware of three or four mathematical definitions for complexity, none of them broadly useful in answering the type of questions I just asked. We are so ignorant of complexity that we haven't yet asked the right question about what it is.

 

But let us look more closely at some of the things that we do know about complexity today:

 

§      It is generally accepted that complexity increases with evolution. This becomes obvious when we compare the structure of advanced creatures (animals, humans) to primitive life forms (worms, bacteria).

§      It is also known that evolutionary change is not gradual but proceeds by jerks. In 1972 Niles Eldredge and Stephen Jay Gould introduced the term "Punctuated Equilibria": long periods of changelessness or stasis—equilibrium—interrupted by sudden and dramatic brief periods of rapid change—punctuations.[2]

 

These two facts taken together imply that complexity itself must grow in a stepladder fashion, at least on a macroscopic scale.

 

§      Another thing we know is that complexity begets complexity. A complex organism creates a niche for more complexity around it; thus complexity is a positive feedback loop amplifying itself. In other words, complexity has the ability to "multiply" like a pair of rabbits in a meadow.

§         Complexity links to connectivity. A network's complexity increases as the number of connections between its nodes increases, and this enables the network to evolve. But you can have too much of a good thing. Beyond a certain level of linking density, continued connectivity decreases the adaptability of the system as a whole. Kauffman calls it "complexity catastrophe": an overly linked system is as debilitating as a mob of uncoordinated loners.[3]

 

These two facts argue for a process similar to growth in competition. Complexity is endowed with a multiplication capability but its growth is capped and that necessitates some kind of a selection mechanism. Alternatively, the competitive nature of complexity's growth can be sought in its intimate relationship with evolution. One way or another, it is reasonable to expect that complexity follows logistic-growth patterns as it grows.

    

Milestones in the History of the Cosmos

     The first thing that comes to mind when confronted with the image of stepwise growth for complexity over time is the major turning points in the history of evolution. Most teachers of biology, biochemistry, and geology at some time or another present to their students a list of major events in the history of life. The dates they mention invariably reflect milestones of punctuated equilibrium (or "punk eek" for short). Physicists tend to produce a different list of dates stretching over another time period with emphasis mostly on the early Universe.

     Such lists constitute data sets that may be plagued by numerical uncertainties and personal biases depending on the investigator's knowledge and specialty. Nevertheless the events listed in them are "significant" because some investigator has singled them out as such among many others. Consequently they constitute milestones that can in principle be used for the study of complexity's evolution over time. However, in practice there are some formidable difficulties in producing a data set of turning points that covers the entire period of time (15 billion years).

     I made the bold hypothesis that a law has been in effect from the very beginning. This was not an arbitrary decision on my part. The suggestion came when I first looked at an early compilation of milestones. In any case, I knew that confrontation with real data would be my final judge. More than once in this paper I have turned to the scientific method as defined by experimental physicists, namely: Following an observation (or hunch), make a hypothesis, and see if it can be verified by real data.

 

The Challenges

     Here are the most challenging issues concerning this paper's methodology in order of decreasing importance, and the way they were dealt with:

 

1.    The complexity associated with a milestone must be quantified at least in relative terms. For example, how much complexity did the Cambrian explosion bring to the system compared to the amount of complexity added to the system when humans acquired speech?

 

To quantify the complexity associated with an evolutionary milestone we must look at the milestone's importance. Importance can be defined as equal to the change in complexity multiplied by the time duration to the next milestone. This definition has been derived in the classical physics tradition: you start with a magnitude (in our case Importance), you put an equal sign next to it, and then you proceed to list in the numerator whatever the quantity in question is proportional to, and in the denominator whatever it is inversely proportional to, keeping track of possible exponents and multiplicative constants. It is intuitively obvious that for a milestone Importance is linearly proportional to the amount of complexity added by the milestone, and also linearly proportional to how long the system survives unchanged following the milestone. The greater the complexity jump at a given milestone, or the longer the ensuing stasis, the greater the milestone's importance will be.

 

Importance = Complexity x Duration           (1)

 

The complexity change associated with a certain milestone will then be inversely proportional to the time period to the next milestone. And to the extent that we are considering milestones of comparable importance, we have a means of quantitatively comparing the change in complexity associated with each jump.

Following each milestone the complexity of the system increases by a certain amount. At the next milestone there is another increase in complexity. Assuming that milestones are approximately of equal importance, and according to the above definition of importance, we can conclude that the increase in complexity $\Delta C_i$ associated with milestone $i$ of importance $I$ is

$$\Delta C_i = \frac{I}{\Delta T_i} \qquad (2)$$

where $\Delta T_i$ is the time period between milestone $i$ and milestone $i+1$.

We thus have a relative measure of the complexity contributed by each milestone to the system. If milestones become progressively crowded together with time, their complexity is expected to become progressively larger, see Figure 1.
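As a minimal sketch of Equation (2), the following Python snippet turns a list of milestone dates into relative complexity jumps. The function name, the choice of I = 1, and the example dates are illustrative assumptions; the dates are not taken from Appendix A.

```python
def complexity_jumps(years_before_present, importance=1.0):
    """Relative complexity jump for every milestone except the last one.

    years_before_present must run from the oldest milestone to the most
    recent one. importance is the common importance I (assumed to be 1 here).
    """
    jumps = []
    for current, following in zip(years_before_present, years_before_present[1:]):
        delta_t = current - following          # time to the next milestone
        if delta_t <= 0:
            raise ValueError("milestones must be strictly ordered in time")
        jumps.append(importance / delta_t)     # Equation (2)
    return jumps


# Hypothetical dates (years before present), for illustration only.
dates = [1000.0, 500.0, 250.0, 125.0]
jumps = complexity_jumps(dates)
print(jumps)
```

Because the spacing halves at each step in this made-up example, each jump is twice the previous one, which is the crowding effect that Figure 1 illustrates.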

                       

Complexity per Milestone

Figure 1. To the extent that milestones of equal importance appear more frequently, their respective complexity increases. The area of each rectangle represents importance and remains constant. The scales of both axes are linear.

 

 

2.    The time frame is vast and the crowding of milestones in recent times is so dense that no logistic or exponential function can be used to describe the growth process.

 

A logistic function does not necessarily need to be a function of time. Moreover, there are processes for which our Euclidean conception of time is not appropriate. For this analysis a better-suited time variable is the sequential milestone number because this way we can handle the singularity as $\Delta T \to 0$. Once forecasts are obtained for complexity jumps associated with future milestones we can use the definition of importance coupled with the equi-importance assumption to derive explicit dates for future milestones.

 

3.    Milestones from different evolutionary processes (cosmological, geological, biological, etc.) and by different authors (physicists, biologists, historians, etc.) need to be combined in a rigorous way. There is a need for normalization when authors furnish data sets with different numbers of milestones for the same chronological period.

 

The equi-importance assumption is key to dealing with both of these issues. If all milestones in a data set are equally important, then the corresponding complexity jumps, calculated as described in Challenge 1, are directly comparable no matter what evolutionary process they belong to. Similarly, if someone's data set contains more milestones than someone else's data set for the same chronological period, then the milestones in the former set must carry less importance than those in the latter. The data sets are normalized so that they give the same overall complexity contribution for the same time periods.
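The normalization step can be sketched as follows. The paper does not spell out the exact procedure, so this is only one plausible reading: each data set's jumps are rescaled so that their sum over a common period equals the same total. The function names, the target total of 1, and both example data sets are hypothetical.

```python
def jumps_from_dates(years_before_present, importance=1.0):
    """Equation (2): importance divided by the time to the next milestone."""
    pairs = zip(years_before_present, years_before_present[1:])
    return [importance / (older - newer) for older, newer in pairs]


def normalize(jumps, target_total=1.0):
    """Rescale jumps so that their cumulative contribution equals target_total."""
    scale = target_total / sum(jumps)
    return [jump * scale for jump in jumps]


# Two hypothetical data sets covering the same period (800 to 100 years ago)
# with different numbers of milestones.
set_a = [800.0, 400.0, 200.0, 100.0]
set_b = [800.0, 200.0, 100.0]

norm_a = normalize(jumps_from_dates(set_a))
norm_b = normalize(jumps_from_dates(set_b))
print(sum(norm_a), sum(norm_b))
```

After rescaling, both sets contribute the same cumulative complexity over the shared period, so a set with more milestones ends up with less weight per milestone.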

 

4.    How many turning points should an adequate data set contain? One can always argue that a large number of important events have been neglected.

 

If we consider only the top most important milestones, we can invoke Pareto's rule—also known as the 80/20 rule—to argue that 20 percent of all milestones account for 80 percent of all complexity acquired during the time period in question. Moreover dealing with only major milestones improves the equi-importance requirement. Milestones of large importance are by definition milestones of comparable importance. Naturally some of them will be more important than others, but the average importance will be a relatively large number, and the spread around this average a relatively small number. Therefore, on a first approximation we can treat all milestones as being of equal importance.

Remark: A milestone is assigned to a point in time, i.e. a date. If more than one event is associated with the same date, the milestone's importance reflects the sum total of the importance of all such events.

 

3. The Data

My first attempt to compile a set of milestones and determine a growth law from it turned out bittersweet. I analyzed 20 milestones compiled during a brainstorming session with colleagues. This early data set proved amenable to a description by a logistic curve, but the result was subsequently criticized on the ground that there could be bias in the choice of milestones. So I set out to find more objective data from independent and reliable sources in order to be able to defend them as unbiased.

     Searching the Internet for something like "Major Events in the History of..." yields scores of pointers and chronologies, so-called timelines. Many of them have to do with some classroom assignment. Some of them stand out in terms of completeness and credibility. I briefly present below six of the thirteen data sets I have retained. A complete list of the data used in the analysis, including milestone descriptions and dates, can be found in Appendix A.

 

§      The Cosmic Calendar. Carl Sagan has put together a one-year calendar matching the entire history of the Universe, and pointing out dates of major events.[4] The set consists of 47 milestones that cover the entire time period (big bang to present) but suffer somewhat from the calendar format. Time resolution becomes insufficient for milestones that fall in the same time bucket. It happens with the calendar's monthly buckets, and again later with the buckets of seconds. In fact, it seems that during these periods of saturated time resolution Sagan is enumerating milestones on a bucket-by-bucket basis reporting on things that happened during the time bucket, as if he is driven by the structure of the time buckets instead of the spacing of the events.

§      The data sets from Encyclopedia Britannica and the A.M.N.H. (American Museum of Natural History) are free from time-resolution distortions but are less exhaustive. They contain 16 and 20 milestones respectively.

§      Major Events in the History of Life. More than 1700 students, faculty, and other members of the UCLA community attended a "Major Events in the History of Life" symposium on January 11, 1991, convened by the IGPP Center for Study of Evolution and the Origin of Life at the University of California. A volume was put together making accessible the proceedings of that symposium.[5]

§      Major Events in the Universe's History. Two physicists published a Scientific American article entitled "The Structure of the Early Universe." Their data set concerns events and dates covering the pre-human evolution of the universe.[6]

§      Professor Paul D. Boyer, biochemist, Nobel Prize 1997, kindly provided me with his own set of milestones for which I assigned the dates.

 

The data used in the analysis incorporate milestones from thirteen data sets, the last of which is the author's own. I decided to include a data set of my own for two reasons. First, I believe that having gone through all the research, I was well positioned to distill a rather complete, defensible, and scientific set of evolutionary milestones. Second, I needed data on the twentieth century, neglected by the other authors. Of the 12 other sets considered, only Sagan's data set addresses the twentieth century, and his data are plagued by the calendar-format problem mentioned earlier.

Of the 13 data sets, only Paul Boyer's and mine were created in direct response to the question: Which are the 25 most significant milestones in the evolution of the Universe? The motivation of other authors, like Sagan and A.M.N.H., was to put events into a time perspective. But in so doing, they answered the same question simply by selecting what to list as major events.

Because of the different number of milestones between data sets, and the fact that different sets sometimes give different dates for the same event (e.g., the time of the big bang ranges from 13 to 20 billion years ago), I decided to derive a "canonical" set of milestones and use the spread between authors to calculate errors. My assumption was that there must be some coherence between the 13 data sets, i.e., many milestone dates must be common to most sets. Combining 13 data sets into one greatly reduces the uncertainties on the results.

 

The Canonical Set of Milestones

     Figure 2 shows a histogram of all milestone dates (a total of 302) with logarithmically increasing time buckets as we go backward in time. This choice of binning the data is not arbitrary. It became obvious when I plotted the 302 points on a number of linear graphs with different-size time buckets each. The logarithmically increasing time buckets are chosen in such a way that each bucket receives one cluster of milestones. The peak of each cluster is used to define a date for a milestone of the canonical set used as time variable in our analysis. There are twenty-eight canonical milestones but because of complexity's definition (Equation 2) there are only twenty-seven peaks in Figure 2.

For each peak the average complexity change is calculated, as well as an error given by the spread around the peak (one standard deviation). For peaks featuring only one entry (for example, milestones during the last 100 years) I arbitrarily assign the average error as the error. Fractional milestone numbers are assigned to all milestones according to their date.
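The clustering idea can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the procedure used for Figure 2: the paper chooses its bucket edges so that each bucket receives one cluster, whereas this sketch uses a fixed number of buckets per decade (`buckets_per_decade`, a hypothetical parameter) and takes the geometric mean of each cluster as a stand-in for its peak. The example dates are made up.

```python
import math
from collections import defaultdict


def log_bucket(years_ago, buckets_per_decade=2):
    """Index of the logarithmic time bucket that contains a date."""
    return math.floor(math.log10(years_ago) * buckets_per_decade)


def canonical_dates(all_dates, buckets_per_decade=2):
    """Group dates into logarithmic buckets; return (date, count), oldest first."""
    clusters = defaultdict(list)
    for date in all_dates:
        clusters[log_bucket(date, buckets_per_decade)].append(date)
    result = []
    for key in sorted(clusters, reverse=True):   # larger index = further back
        logs = [math.log10(date) for date in clusters[key]]
        mean_log = sum(logs) / len(logs)
        result.append((10 ** mean_log, len(logs)))  # geometric mean, cluster size
    return result


# Hypothetical milestone dates (years before present) pooled from several sources.
example_dates = [1.5e10, 1.3e10, 1.4e10, 4.0e9, 3.8e9, 5.0e3, 5.5e3]
canon = canonical_dates(example_dates)
for date, count in canon:
    print(f"{date:.3g} years ago, {count} entries")
```

Dates that different authors give for the same event fall into the same bucket and collapse into one canonical milestone, while the number of entries per bucket plays the role of the histogram height.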

 

Histogram of All Milestones

Figure 2. A histogram of all milestones with logarithmic time buckets. The thin black line is superimposed to outline the peaks that define the dates of the "canonical" milestones. On the horizontal axis we read the dates of these milestones.

 

 

4. The Analysis

     A distribution of the change of complexity per milestone for all thirteen data sets is shown in Figure 3. The different data sets have been normalized for equal cumulative complexity contributions over identical time periods. Consequently the units of the vertical axis are arbitrary to an overall multiplicative constant. The picture comparing the normalized data for all thirteen sources is rather coherent as there is good agreement between the different data sets. Furthermore the data points generally line up on a straight line in a semi-log plot, which is the hallmark of exponential growth, or alternatively, the early part of logistic growth. The milestone-number axis marks the milestones of the canonical set.

Complexity per Milestone

 

Figure 3. Thirteen different sources of data corroborate each other. The thin black line connects the canonical milestones (see text), and also represents the average complexity change at a given milestone. The vertical axis depicts the logarithm of the change in complexity.

 

     We can now proceed to fit the data with an exponential and a logistic function. Given that Figure 3 depicts complexity's rate of growth—i.e., complexity change per milestone—we expect the trend to follow the first derivative of the two functions. We therefore fit to the expressions:

 

$$\text{(exponential)} \qquad e^{aX+b}$$

where $a$ and $b$ are constants, and

$$\text{ln(logistic life cycle)} \qquad \ln\left[\frac{Ma}{\left(1+e^{-a(X-X_0)}\right)\left(1+e^{a(X-X_0)}\right)}\right]$$

where $M$, $a$, and $X_0$ are constants and $X$ is the sequential milestone number. The logistic life cycle is the first derivative of the familiar logistic function:

$$\frac{M}{1+e^{-a(X-X_0)}}$$
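The relation between the two expressions can be checked numerically. The short Python sketch below compares the life-cycle formula with a finite-difference derivative of the logistic function; the parameter values are arbitrary illustrative choices, not the fitted ones.

```python
import math


def logistic(x, M, a, x0):
    """Familiar logistic function with ceiling M, slope a, and midpoint x0."""
    return M / (1.0 + math.exp(-a * (x - x0)))


def life_cycle(x, M, a, x0):
    """First derivative of the logistic, in the product form used in the text."""
    u = a * (x - x0)
    return M * a / ((1.0 + math.exp(-u)) * (1.0 + math.exp(u)))


M, a, x0 = 1.0, 0.5, 10.0   # arbitrary values for the check
h = 1e-5                     # step for the central difference
max_diff = 0.0
for x in (5.0, 10.0, 14.0):
    numeric = (logistic(x + h, M, a, x0) - logistic(x - h, M, a, x0)) / (2 * h)
    max_diff = max(max_diff, abs(numeric - life_cycle(x, M, a, x0)))
print(max_diff)
```

At the midpoint the life cycle takes its maximum value Ma/4, which is why the midpoint of the logistic marks the highest rate of growth.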

 

     Figure 4 shows the canonical set of milestones with an exponential and a logistic fit superimposed. The logistic fit is better than the exponential one (70% confidence level compared to 30%). Table I shows the particular details of the fits.

 

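Table I - Fit Results

| Formula fit | b | a | M | X_0 | χ² | Degrees of freedom |
|---|---|---|---|---|---|---|
| aX + b | -23.749 | 0.7554 | | | 28.3 | 25 |
| ln[ Ma / ((1 + e^(-a(X - X_0))) · (1 + e^(a(X - X_0)))) ] | | 0.7735 | 0.1375 | 27.89 | 20.2 | 24 |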

 

I have made an attempt to be scientifically correct. However, the reader should be aware that the Chi-square estimates (and the associated confidence levels) cannot reflect all uncertainties. There are sources of error that have not been properly accounted for. For example, errors due to having widely different dates for the same event (sometimes with good reason as the exact date is still being debated), or errors due to the approximation that the milestones are equally important.
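One plausible reading of the quoted confidence levels is the upper-tail probability of the chi-square distribution for the values in Table I. The paper does not state how the levels were computed, so this interpretation is an assumption. The sketch below evaluates that probability with the standard library only, using the recurrence Q(a + 1, y) = Q(a, y) + y^a e^(-y) / Γ(a + 1) for the regularized upper incomplete gamma function.

```python
import math


def chi2_upper_tail(chi2, dof):
    """Probability that a chi-square variable with dof degrees of freedom exceeds chi2."""
    y = chi2 / 2.0
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-y)               # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))    # Q(1/2, y)
    # Step a up to dof / 2 using Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < dof / 2.0 - 1e-9:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


p_logistic = chi2_upper_tail(20.2, 24)      # logistic fit, Table I
p_exponential = chi2_upper_tail(28.3, 25)   # exponential fit, Table I
print(p_logistic, p_exponential)
```

Under this reading the two probabilities can be compared directly with the 70% and 30% quoted for the logistic and exponential fits.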

 

Complexity per Milestone

 

Figure 4. Logistic and exponential fits to the data of the canonical milestone set. The vertical axis depicts the logarithm of the change in complexity. The faint circles on the forecasted trends indicate the complexity of future milestones.

 

The midpoint of the logistic function is milestone number 27.89, which corresponds to 10 years ago. In other words, complexity grew at the highest rate ever around 1990. From then onward complexity's rate of change began decreasing. Future milestones of comparable importance will henceforth be appearing less frequently.

But according to the exponential law, milestones punctuating complexity jumps will continue appearing closer together at the same exponential rate, and 25 years from now we should expect successive turning points of the same importance to be spaced only 5 days apart. Table II spells out the timing of future milestones as expected from the logistic and exponential growth laws determined by the above fits.

 

Table II - Forecasts for Complexity Change as a Function of Time

| Milestone number | Logistic fit: complexity change* | Logistic fit: years from now | Exponential fit: complexity change* | Exponential fit: years from now |
|---|---|---|---|---|
| 28 | 0.0265 | 38 | 0.0744 | 13.4 |
| 29 | 0.0223 | 45 | 0.1584 | 6.3 |
| 30 | 0.0146 | 69 | 0.3372 | 3.0 |
| 31 | 0.0081 | 124 | 0.7178 | 1.4 |
| 32 | 0.0041 | 245 | 1.5278 | 0.7 |
| 33 | 0.0020 | 508 | 3.2518 | 0.3 |
| 34 | 0.0009 | 1078 | 6.9213 | 0.1 |
| 35 | 0.0004 | 2315 | 14.7317 | 0.07 |
| 36 | 0.0002 | 5000 | 31.3558 | 0.03 |
| 37 | 0.0001 | 10800 | 66.7397 | 0.015 |

* In the same arbitrary units as Figures 3 and 4.
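The forecasts can be approximately reproduced from the parameters in Table I. The sketch below evaluates both fitted rate functions at future milestone numbers and converts each complexity change into a number of years through Equation (2). Setting I = 1 in the table's arbitrary units is an assumption inferred from the tabulated values (for example 0.0265 × 38 ≈ 1); the paper does not state it explicitly.

```python
import math

# Fitted parameters as reported in Table I.
EXP_A, EXP_B = 0.7554, -23.749
LOG_A, LOG_M, LOG_X0 = 0.7735, 0.1375, 27.89
IMPORTANCE = 1.0  # assumed common importance I, in the table's arbitrary units


def exponential_change(x):
    """Complexity change at milestone number x according to the exponential fit."""
    return math.exp(EXP_A * x + EXP_B)


def logistic_change(x):
    """Complexity change at milestone number x according to the logistic life cycle."""
    u = LOG_A * (x - LOG_X0)
    return LOG_M * LOG_A / ((1.0 + math.exp(-u)) * (1.0 + math.exp(u)))


for n in range(28, 38):
    dc_log, dc_exp = logistic_change(n), exponential_change(n)
    # Equation (2) rearranged: years = I / complexity change.
    print(f"{n}  {dc_log:.4f}  {IMPORTANCE / dc_log:8.0f}  "
          f"{dc_exp:8.4f}  {IMPORTANCE / dc_exp:7.3f}")
```

Small differences from the published figures are to be expected, since the fitted parameters are quoted to only four significant digits.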

 

The accuracy of the results, as reflected in the significant digits retained in the numbers reported, may seem overly optimistic. However, the reader should bear in mind two things. First, the curves are extremely steep; on a linear time scale they would appear practically horizontal across billions of early-Universe years. Second, the significant digits in the results reflect more the precision of the method and less the accuracy of the answers because not all systematic errors have been accounted for (see earlier remark on sources of unaccounted errors).

 

 The Close-Up Picture

The case can be made, if less rigorously, for a finer structure in the evolution of the trajectory of complexity's change. It has been shown that any growth process may consist of smaller logistic sub-processes.[7] Looking at Figure 3 closely we can discern smaller S-shaped steps. Such structure indicates an alternation between periods when the milestones progressively crowd together and periods when they are roughly regularly spaced in time. This is largely due to the fact that as we move through time we encounter a number of rather well defined evolutionary sub processes. The thin black line in Figure 3 (representing the average change of complexity per milestone), suggests at least seven such sub processes. In Figure 5 logistic curves are adapted to these segments.

The seven logistic curves do not result from rigorous fits to the data because of too few milestones and too much jitter on the data points in each segment (in other words, the errors are too large for the fitting procedure to work). The thick gray lines are logistic functions drawn in to simply guide the eye. However, the fair agreement between thick lines and the corresponding sections of the dotted line is evidence that we are dealing with rather independent natural-growth processes.

 

Different Sub Processes in the Evolution of Complexity

 

Figure 5. Seven small logistic curves have been superimposed to point out evidence for a finer structure. The dotted line is the same as the thin black line in Figure 3. The vertical axis depicts the logarithm of the change in complexity. The legend lists the sub processes in chronological order.

 

In order to better understand the seven sub processes, Table III lists the relevant parameters for each process. The mathematical parameters of the logistic functions being of less interest, it is preferable to give the dates corresponding to the 10%, 50%, and 90% penetration level for each process. The range 10%-90% of a logistic growth process is traditionally taken as the period of main thrust toward higher growth. Above the 90% level one can argue that a stable maximum level has been reached.

 

Table III - The Seven Phases of Complexity's Growth

| Evolutionary process | 10% (years before present) | 50% (years before present) | 90% (years before present) |
|---|---|---|---|
| Cosmic | 13,100,000,000 | 10,100,000,000 | 7,900,000,000 |
| Geological | 1,450,000,000 | 1,050,000,000 | 820,000,000 |
| Hominization | 19,500,000 | 4,020,000 | 625,000 |
| Homo sapiens | 434,000 | 308,000 | 239,000 |
| Modern human | 107,000 | 38,200 | 15,100 |
| Civilization | 10,700 | 6,130 | 5,000 |
| Scientific | 539 | 225 | 100 |
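The 10%, 50%, and 90% penetration levels follow directly from the logistic parameters. As a hedged illustration, the sketch below inverts a generic logistic function to find where a given fraction of the ceiling is reached. The parameter values are hypothetical and the positions are expressed in the logistic's own variable (milestone number in this paper), not in years.

```python
import math


def position_at_level(fraction, a, x0):
    """Position X at which a logistic curve reaches the given fraction of its ceiling M.

    Obtained by solving fraction = 1 / (1 + exp(-a * (X - x0))) for X.
    """
    return x0 + math.log(fraction / (1.0 - fraction)) / a


a, x0 = 0.5, 10.0   # hypothetical slope and midpoint
x10 = position_at_level(0.10, a, x0)
x50 = position_at_level(0.50, a, x0)
x90 = position_at_level(0.90, a, x0)
print(x10, x50, x90)
```

The 50% level coincides with the midpoint, and the 10% to 90% span equals ln(81)/a regardless of the ceiling, so a single number characterizes the duration of the main growth thrust.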

 

     The names given to the seven phases have been inspired by what happened during each sub process. Consequently, "Cosmic" refers to the process around the formation of our galaxy. "Geological" refers to early forms of life and is centered on the appearance of multicellular life. "Hominization" is the period between the divergence of orangutan from Hominidae and the development of speech; it is centered on the appearance of first bipedalism and stone tools. "Homo sapiens" is a relatively short period dominated by Homo sapiens and the domestication of fire. "Modern human" extends between the first burial of the dead and the invention of agriculture; it is centered around the time of rock art, and includes ritual/spiritual behavior (magic shamanism). "Civilization" is a name inspired by city dwelling and religion becoming important; it is centered around the appearance of writing and the wheel. Finally, "Scientific" is the growth phase that begins with the Renaissance and ends with modern physics; it is centered on the industrial revolution and the establishment of the scientific method.

 

 5. Discussion of Results

     This paper studies the evolution of complexity from the beginning of the Universe to the present day. The hypothesis, verified via a successful logistic fit on data, is that a simple diffusion law has been governing complexity's growth across diverse evolutionary processes (cosmological, geological, biological, etc.). We are obviously concerned with an anthropic Universe here since we are overlooking how complexity has been evolving in other parts of the Universe. Still, the author believes that such an analysis carries more weight than just the elegance and simplicity of its formulation. John Wheeler has argued that the very validity of the laws of physics depends on the existence of consciousness.2 In a way, the human point of view is all that counts!

     The work reported here links logistic growth and complexity in two different ways. One way is how complexity has been accumulating in the Universe along a large logistic curve (Figure 4). Another way is how complexity's rate of growth has been following smaller logistic curves in the close-up picture of Figure 5. There is a fundamental difference between these two pictures. The former involves an S-shaped pattern fitted to the amount of change accumulated whereas the latter involves fitting S-shaped patterns to the rate of change. In both cases evidence for logistic growth argues for natural growth in competition (Darwinian in nature), but the interpretations are different.

 

Seeing Complexity as a Competitive Growth Process

     Observation of logistic growth enables one to argue for the existence of Darwinian competition. Such competition implies that:

·        Some "species" is capable of growing via multiplication.

·        Members of the "species" compete for a limited resource.

·        There is natural selection.

In the logistic function of Figure 4 the "species" is the system's complexity and its members are the complexity chunks carried by the milestones. The limited resource is the system's cumulative final complexity. It is limited because too much complexity may hurt survival as per Kauffman's argument for complexity catastrophe mentioned earlier.

In the logistic functions of Figure 5 the "species" is the speed with which each evolutionary sub process proceeds, and its members are the jumps in speed during the rapid-growth phase (when turning points appear progressively more frequently). The limited resource is maximum speed, characteristic of the evolutionary sub process in question (e.g., geological evolution reached higher levels of c