Circularity In The Collatz Sequence Bradley Berg December 30, 2022 Updated: July 29, 2023
The Collatz series cycles after reaching a repeated value. If a cycle existed a sequence that starts with any value in the cycle will always loop back to itself. Also, any branch of a sequence that reaches any value in the cycle will repeatedly loop back to that value.
It is sufficient to consider only sequences that start at the lowest candidate Seed value in a potential cycle. For the Collatz series we start at the bottom of the loop so that once the series reaches a value lower than the Seed we know it is acyclic. After that, by induction, we know it will eventually terminate at one.
To misappropriate the terms from astrophysics, the point where the sequence dips below the starting Seed is the (event) horizon. A series that goes below the horizon collapses to the singularity, one.
Notation: #3a09 Hex values have a leading pound sign. X*Y Multiplication X^Y Exponentiation [ real ] Truncated integer part of a real number ln( x ) Natural log base e log2( x ) Log base 2
The original Collatz sequence is:
N is Even: N' = N / 2 N is Odd: N' = 3 * N + 1
A variation is to rewrite it as a series of Odd steps. In this form Ei is the number of Even steps in term i. Looking at a result from an Odd step this is the number of trailing zeros. It is always one or more.
N0 = Seed Ni * 3 + 1 Ni+1 = ----------- Ei
To expand this sequence it is useful to define a Sum, K, which gives the total number of even steps according to the original series.
K0 = 0 Ki = Ki-1 + Ei
Then we expand the sequence to form an iterative sequence.
N0 = Seed Only odd Seeds are allowed N1 = Seed * 3 + 1 / 2^K1 K1 = E1 N2 = Seed * 3 + 1 Seed*9 + 3 + 2^K1 ------------ * 3 + 1 = ----------------- 2^K1 2^K2 ---------------- 2^E2 N3 = Seed*9 + 3 + 2^K1 Seed*27 + 9 + 3*2^K1 + 2^K2 ----------------- * 3 + 1 = ---------------------------- 2^K2 2^K3 ----------------- 2^E3 i Ni = Seed*3^i + ∑ 3^(i-j) * 2^Kj-1 j=1 -------------------------------- 2^Ki
To simplify the expansion let Di signify the sum. This is useful since Di encapsulates the non-algebraic portion of the sequence known as the hailstone effect.
Seed * 3^i + Di Ni = --------------- 2^Ki i Di = ∑ 3^(i-j) * 2^Kj-1 j=1
Di can also be computed using a recursive sequence derived directly from the Collatz sequence. Using the recursive form is useful to validate that Di is properly calculated.
D0 = 0 Di+1 = Di * 3 + 2^Ki
This example combines all four sequences to show how they interrelate. The horizon line is the point where the series drops below the Seed.
i Ei Ki Ni Di 0 0 0 139 0 Seed = 139 1 1 1 209 1 (139*3^1 + 3^0*2^0) / 2^1 2 2 3 157 5 (139*3^2 + 3^1*2^0 + 3^0*2^1) / 2^3 -------------------------------------------- Horizon 3 3 6 59 23 (139*3^3 + 3^2*2^0 + 3^1*2^1 + 3^0*2^3) / 2^6 4 1 7 89 133 5 2 9 67 527 6 1 10 101 2093 7 4 14 19 7303 8 1 15 29 38293 9 3 18 11 147647 10 1 19 17 705085 11 2 21 13 2639543 12 3 24 5 10015781 13 4 28 1 46824559
Let L be the number of odd steps taken to reach the horizon. Should NL = Seed the sequence lands right on the horizon and forms a loop. Otherwise, the sequence dips below the horizon and eventually terminates. Going forward we'll refer to L odd steps of the Collatz sequence as a run.
Seed*3^L + DL NL = Seed = ------------- 2^KL Seed * 2^KL = Seed * 3^L + DL 2^KL = 3^L + DL / Seed
In the last equation we see that Seed has to evenly divide DL. Also, when (DL / Seed) is less than 3^L then 2^KL is close to 3^L. Although close is a fuzzy term, it'll be refined in the next section. These constraints are used in Appendix C to find loops in the 3*n-1 sequence and Appendix D covers 3*n+C sequences.
Constraints: 2^KL = 3^L + DL / Seed Seed divides DL 3^L is close to a power of 2
In this section we look at why powers of 3 with many leading ones is related to loops in the sequence. First we devise a contrived example to illustrate what arithmetic for a cycle would look like.
Seed * 2^KL = Seed * 3^L + DL Seed * (2^KL - 3^L) = DL Seed = DL / (2^KL - 3^L)
When 3^L has many leading one bits then (2^KL - 3^L) gets smaller and the size of Seed increases.
For a Seed of 27 the actual value of D29 = #6_954f_e21e_3e81, but here we substitute the value that would be needed to form a loop.
Seed = 27 D29 = #2ab0_1de1_c17f ~ Required value ~ 2^Ki = 3^i + Di / Seed Seed divides D29: #2ab0_1de1_c17f % Seed = 0 D29 / Seed = #194_bebc_826d 3^29 is close to 2^46: 3^29 = #3e6b_4143_7d93 2^46 - 3^29 = #194_bebc_826d 2^K29 = 3^29 + D29 / Seed = 3^29 + #2ab0_1de1_c17f / 27 2^46 = 3^29 + #194_bebc_826d / #1d The width of the numeric values are: Seed = #1b 5 bits 3^29 = #3e6b41437d93 46 bits with 5 lead one bits D29 = #2ab01de1c17f 46 bits D29 / Seed = #194bebc826d 41 bits #3e6b41437d93 3^29 + #194bebc826d D29 / Seed ------------- #400000000000 2^46
The upper 5 bits of 3^29 are all ones. This is what happens when a power of 3 is just a little under a power of 2.
When D29 is divided by the Seed the result is 5 bits shorter than D29. Adding (D29 / Seed) to 3^29 has a carry that flips the upper 5 one bits of 3^29 to zeros so that the result is a power of two.
The number of leading ones on a power of 3 serves as a metric for how close it is to a power of 2. The number of leading ones is small even for large exponents. Since the exponent is the same as the run length, the small number of leading ones limits the width of candidate Seeds. For such a small run length the sequence will go below the horizon and converge well before reaching the required length. Eliahou[2] formally proves "that the inverse proportion of odd elements in a cycle is very close to log2,(3)".
Since candidate runs have several leading binary one digits of a power of three, lets find them first. This example lists exponents resulting in 5 lead one bits.
Exponent - Powers of 3 with 5 leading ones Interval - The Exponent minus the previous exponent Lead Digits - The top lead digits of 3^Exponent Exponent Interval Lead digits of 3^Exponent 29 #3e6b41437d93 82 53 #3e8ca816be3ddb89e ... 135 53 #3eae20c9b77961715 ... 188 53 #3ecfab65f9d556fd1 ... 229 41 #7c30cdf4df1dc9ea0 ... 241 12 #3ef147f51affc7ea7 ... 282 41 #7c7342ef43a36a0a5 ... 335 53 #7cb5d_b79a5ff50d3 ... 388 53 #7cf897a70decc5280 ... 441 53 #7d3b778a8d56046c7 ... 494 53 #7d7E7b374059b58f3 ... 535 41 #f820a47e15c1033d4 ... 547 12 #7dc1a2c04d505eff5 ... 588 41 #f8a56baf0ed806d4f ... 641 53 #f92a79ed6922ff0a1 ... 694 53 #f9afcf5f2a23facf3 ... 747 53 #fa356c2a6bb5a258d ... 800 53 #fabb50755c161a57b ...
Notice that for a given number of one bits the exponents increase regularly, but with three different intervals. Also the largest interval, 53, is the sum or the other two intervals. This pattern occurs with powers of 3 with a fixed number of leading ones. The three intervals differ though depending on the number of leading ones.
Here we are only concerned with the lowest exponent; which is not distributed smoothly. The regularity of intervals for larger exponent just shows that run lengths are not completely chaotic.
This next list has the first power of 3 that have a number of leading ones. After a few weeks the computer spit out 42 and the mice have been dutifully informed. If a loop existed it would have to have a run length with many lead ones in 3^Length. This considerably narrows down candidate run lengths.
Lead Lowest Exponent for Lead Lowest Exponent for Ones 3^Exponent with Ones Ones 3^Exponent with Ones 1 2 21 762_148 2 3 22 381_074 3 10 23 190_537 4 5 24 32_343_822 5 29 25 21_562_548 6 41 26 10_781_274 7 147 27 118_212_940 8 253 28 343_857_546 9 306 29 171_928_773 10 1_636 30 1_987_866_895 11 8_951 31 1_192_720_137 12 12_276 32 795_146_758 13 14_271 33 397_573_379 14 31_202 34 19_760_456_010 15 15_601 35 13_173_637_340 16 47_468 36 6_586_818_670 17 158_670 37 72_057_431_991 18 79_335 38 412_584_135_936 19 2_858_055 39 275_056_090_624 20 1_524_296 40 137_528_045_312 41 3_149_971_404_836 42 4_656_193_084_598
For this analysis we are going to use the width of values in bits. Arithmetic is performed on positive integer numbers. Their width is given by the following function.
width( integer ) = [log2( integer )] + 1
The width of a power of 3 can also be computed using logarithms.
width( 3^L ) = [L * ln(3) / ln(2)] + 1 = [L * log2(3)] + 1
The sequence D depends on the Ki values which are the number of even steps following each odd step. They cannot be determined algebraically resulting in the hailstone effect. Instead we derive bounds for the sequence after L steps.
L DL = ∑ 3^(L-j) * 2^Kj-1 j=1
As the terms progress the powers of 3 are decremented in each step while the powers of two increase. They each balance out so that all the terms are of the same magnitute. A balance is maintained because the powers of two are limited by the K sequence. If a Ki value is too big then the corresponding value in the Collatz series would dip below the horizon and never reach the full run length.
A maximum bound for DL can be contrived by setting Ki values so they produce a large value. This is done by setting them to their maximum value early in the run. Since the powers of 3 are biggest early in the run, these terms get even larger with higher Ki values.
As the series progresses the Ki values are the number of even steps and the index is the number of odd steps. To stay above the horizon then 3^j must be greater than 2^Kj. Here is an algorithm that produces maximum values of Ki to determine a maximum bound for DL:
E = 0 K = 0 Dmax = 3^(L-1) DO j = 2 to L E = [log2( 3^(J-1) )] - K K += E Dmax += 3^(L-j) * 2^K -
The first term of the series, 3^(L-1), is always the largest. In each step the power of 3 is reduced by 3^j. Since the value of 2^Kj has to be under 3^j we know that:
3^(L-j) * 3^j > 3^(L-j) * 2^Kj-1 3^(L-1) > 3^(L-j) * 2^Kj-1 L DL = ∑ 3^(L-j) * 2^Kj-1 < L * 3^(L-1) j=1
The terms of the sum are close in magnitude so L * 3^(L-1) turns out to be a close upper bound on DL.
Using the same approach, a lower bound for DL can be made. In this case all values of Ej are one and Kj = j.
K = 0 Dmin = 3^(L-1) DO j = 2 to L K += 1 Dmin += 3^(L-j) * 2^K -
The lower bound produced by this algorithm can be calculated exactly as the sum of two parts. The first term, 3^(L-1), is the initial value. The remaining terms are calculated as the closed form of the geometric series.
Dmin = 3^(L-1) Initial term + 3 * 2^(L-1) * (1.5^(L-2) - 1) Geometric term
The sum of the remaining terms are computed in reverse order. The last term is 3 * 2^(L-1) and each prior term is 1.5 times as big.
T = 2^(L-1) Last term Sum = T*1.5 + T*1.5^2 + T*1.5^3 + ... + T*1.5^(L-1) Sum = T * {1 + 1.5 + 1.5^2 + 1.5^3 + ... + T*1.5^(L-1)}
The sum of the geometric series is then:
L-1 1 - 1.5^(L-1) 1 - 1.5^(L-1) ∑ 1.5^i = ------------- = ------------- = 1.5^(L-1) * 2 i=0 1 - 1.5 -.5 Sum = T * 1.5^(L-1) * 2 = 2^(L-1) * 1.5^(L-1) * 2 = { 3^(L-1) - 2^(L-1) } * 2 = 2 * 3^(L-1) - 2^L DL >= 3^(L-1) + 2 * 3^(L-1) - 2^L DL >= 3^L - 2^L
Together the bounds are:
3^L - 2^L <= DL < L * 3^(L-1)
These charts list bounds after running many Seeds that produced runs of a given length. Programmatically the bounds were validated for Seeds up to 6_000_000_000_000. It's important to remember that these bounds only apply to a loop with a given length and not to the value of the Collatz sequence in general.
L * 3^(L-1) Observed L Maximum Bound Maximum DL 10 #300de #169c9 15 #446bc17 #33e4817 20 #569860d1c #4069e6dd5 25 #66bf4ce0df9 #4997c31d84f 29 #25b62218c688d #195829cf3badf 30 #75091a5e8b73a #4b72433ff27fd 35 #819b94b3b36e8bb #50c4ef4133b9527 40 #8_c99eb99ccefec1b8 #5_33af21897da9a539 41 #1b_0594e128961c2d49 #f_b49c4be803b5a2cf 45 #962_4ddf75d38b4c1ddd #59d_d56ec09eed78837f 50 #9e5ae_21ae451cea477f16 #5edd6_78b6ac73a7ff33a1 55 #a5584b7_c4765d176ba6fedf #563376f_74f8d09154e81f4b 60 #ab376e2a8_2a911fe379a731d4 #570b66f89_7c32433470545e89 3^L - 2^L Observed L Minimum Bound Minimum DL 10 #e2a9 #18901 15 #da726b #1f603b7 20 #cfc41b91 #1f29d7af1 25 #c544562aa3 #28e69d42f4f 29 #3e6b21437d93 #b088507802db 30 #bb4183ca78b9 #2b9d16ab71fb1 35 #b1bf64d930979b #1a4445c586dabe3 40 #a8b8b352291fe821 #e28c7dbbac598c21 41 #1_fa2a1af67b5fb863 #1_ffdee34b9a06b863 45 #a0_275309fd09495753 #a0_baeb3fceb4355753 50 #9805_53ecdb2fd09de3c9 #982B_32dba1a9a4dde3c9 55 #904d0e_ad200e6305df37cb #911646_4dcfb2c26c9f37cb 60 #88f924ee_beeda7fe92e1f5b1 #8a3b121e_a02395040bfef5b1
We can plug the maximum D value into the constraint to get the maximum Seed value for a given loop length.
Seed * 2^KL = Seed * 3^L + DL Seed * (2^KL - 3^L) = DL Seed = DL / (2^KL - 3^L) Seed < DmaxL / (2^KL - 3^L) L * 3^(L-1) Seed < ----------- 2^KL - 3^L
The Seed is maximized when: KL = [log2( 3^L )] + 1 = width( 3^L )
Maximum L * 3^(L-1) Candidate < ----------- L KL Seed 2^KL - 3^L 5 8 31.15 10 16 30.34 20 32 28.76 29 46 381.64 30 48 27.24 40 64 25.78 41 65 1_185.43 50 80 24.37 100 159 79.76 147 233 6_700.17 150 238 257.93 200 317 12_790.90 250 397 120.29 253 401 27_071.44 300 476 235.14 306 485 99_729.97 350 555 583.11 400 634 12_757.65 450 714 213.80 500 793 385.17 1_636 2_593 583_014.68 8_951 14_187 6_559_828.85 12_276 19_457 17_302_831.00 14_271 22_619 45_086_468.18 15_601 24_727 285_814_986.23 31_202 49_454 285_812_386.08 47_468 75_235 1_447_674_321.78 79_335 125_743 7_216_089_270.36 158_670 251_486 7_216_076_047.89
Using the constraints on the D sequence we can now derive an upper bound on the Seed. For a loop we need NL = Seed. Then the loop constraint can be applied to derive an upper bound for the Seed.
Seed*3^L + DL NL = Seed = ------------- 2^KL Seed * 2^KL = Seed * 3^L + DL 2^KL = 3^L + DL / Seed Seed = DL / (2^KL - 3^L)
Replacing DL with its bounds in turn bounds the Seed.
Dmin / (2^KL - 3^L) < Seed < Dmax / (2^KL - 3^L) 3^L - 2^L L * 3^(L-1) ------------ < Seed < ------------ (2^KL - 3^L) (2^KL - 3^L) where: KL = [log2( 3^L )] + 1
In order to align (2^KL - 3^L) below the leading ones in 3^L the denominator needs to be reduced by 2^Ones.
For example 3^29 has 5 leading ones and the initial maximum Seed is 301.
29 * 3^(29-1) 663426981193869 Seedmax < ------------- = ---------------- = 301 2^(46 - 5) 2199023255552 < (29 * 2^5) / 3 = 309
Due to arithmetic boundary conditions involving additive carries, in practice this bound needs to be doubled. For this example the final bound would be 618 instead of 309.
Seedmax < [(29 * 2^6) / 3] = 618
The Seed is limited to the upper bound in order to meet the constraint on forming a loop. This severely limits eligible Seeds and effectivly makes a loop impossible.
For large lengths the calculation of 3^L can be difficult. To simplify the arithmentic the 2^KL term is replaced by 3^L in the inequality. Since 3^L is slightly under 2^K the upper bound of the Seed will only slightly increase.
L * 3^(L-1) L * 3^(L-1) L * 2^Ones Seedmax < -------------- < -------------- = --------- 2^(KL) / 2^Ones 3^(L) / 2^Ones 3 L * 3^(L-1) L * 1.5^(L-1) L * 2^Ones Seedmax < -------------- < --------------------- = ---------- 2^(KL - Ones) 2^(KL - Ones - L + 1) 3 KL = [log2( 3^L ) + 1] log2( Seedmax ) + 1 = log2( L * 3^(L-1) ) - [log2( 2^(KL - Ones) )] + 1 = {log2( L ) + (L-1)*log2(3)} - {KL - Ones} + 1 = log2( L ) + (L-1)*log2(3) - {integer[log2( 3^L )] + 1} + Ones + 1 = log2( L ) + (L-1)*log2(3) - integer[L*log2(3)] + Ones
Since the Seed is doubled to handle carries the maximum simplifies as follows. Note that the limit on the Seed is is linear for lengths where 3^Length has a given number of leading one bits. Actual Seeds grow exponentially as run lengths increase.
Seedmax < {Length * 2^(Ones + 1)} / 3
The Seed bound relies on the mechanics of binary arithmetic. Rather than using an algebraic derivation we'll go through it again in binary. It should also make things more concrete and precise. First lay out the bit patterns in binary numbers.
2^KL = 3^L + DL / Seed Computation to be performed. Binary Value Description [ones | more bits] 3^L with lead ones + [ D / Seed ] DL / Seed ------------------ [1 | zeros | rest ] rest needs to be zero for a loop
Since DL is difficult to compute we can substitute it with DmaxL.
Binary Value Description [ones | more bits] 3^L with lead ones + [ D / 2^J ] Dmax / 2^{ width( 3^L) - Ones } ------------------- [1 | zeros | rest ] rest needs to be zero for a loop [ Dmax ] Maximum value of DL
This next set of diagrams show what happens when the alignment needs adjusting.
C = 0 The addition does not carry into the ones. [ 11111 0 xxxxxxx ] 3^L [ 1 0 yyyyyyy ] Dmax / Seedmax ^____ Align to here works only in this case C = 1 The addition carries into the ones. [ 11111 0 xxxxxxx ] 3^L [ 1 yyyyyyy ] Dmax / Seedmax is bigger ^____ Align to here gives the biggest Seedmax value
This example shows how sometimes Seedmax needs to be doubled to account for the carry. The run length is 41; for which 3^41 has 6 leading ones.
3^41 = #1_fa2a_1cf6_7b5f_b863 Dmax = #1b_0594_e128_961c_2d49 41 * 3^(41-1) Seedmax = 1749 41 * 2^(6+1) / 3 Dmax = 110110000010110010100111000010010100010010110000111000010110101001001 69 bits 3^41 = 11111101000101010000111001111011001111011010111111011100001100011 65 bits Dmax ---- = 11000000001001111011000011101011110010110100011111100100110 59 bits 1152 Sum = 100000000000101011010110100111110111011001101110011111011110001001 66 bits
In order to loop the entire result would have to be a power of two. While true, it is sufficient to show that the limit on the Seed is so small that it cannot lead to a long enough run. The maximum Seed of 1749 is still too small to produce a run length of 41. The smallest Seed that can do that is 4591.
The following table shows that there is no loop whose length is under 50 million where 3^L has a single leading one bit. Seeds under the maximum size of 2^25 can only produce runs with a maximum length of 188; far below the required length of 50 million.
width(3^L) = L * log2( 3 ) + 1 Dmax = L * 3^(L-1) width(Dmax) = log2( L ) + (L-1) * log2( 3 ) + 1 width(Smax) = width(Dmax) - width(3^L) + 1 Lmax: Maximum length for runs with Seeds up to Smax L Lead bits of 3^L width(3^L) width(Dmax) width(Smax) Smax Lmax 101 1001_0101_0110 161 166 6 63 37 201 1001_0001_1100 319 325 7 127 37 301 1001_0111_1100 478 484 7 127 37 400 1001_0111_0010 634 642 9 511 37 500 1010_1000_0110 793 800 8 255 37 1000 1001_1110_0001 1585 1594 10 1023 51 5000 1001_0001_1100 7925 7936 12 4095 51 10000 1011_0111_1000 15850 15862 13 8191 51 50000 1000_1101_0101 79249 79263 15 32767 66 100000 1010_1011_0001 158497 158512 16 65535 85 500002 1011_0011_1011 792485 792502 17 131071 85 1000001 1001_0011_1001 1584965 1584983 19 524287 109 5000001 1010_0000_1001 7924815 7924835 21 2097151 141 10000001 1001_1011_1101 15849627 15849649 23 8388607 155 50000001 1011_1001_1001 79248127 79248151 25 33554431 188
This next chart relates the lowest power of 3 with a number of leading ones to the largest Seed allowed that meets the alignment constraint. Trivial values are excluded. The width of the maximum Seed is also listed. It fits linearly to: 2.0003 * Ones + 26.746
Computations by Barina[1] have shown that Seeds under 68 bits terminate at one. From the following table, Seeds for loops with under 6_586_818_670 odd values are coverd . We now know there can be no loops with a run length that is shorter. Previously Eliahou[2] showed that loops must contain at least 17_087_915 elements; for which he includes even sequence values.
Lowest Maximum Seed To Maximum Leading 3^Power = Align with Ones Seed Ones Run Length Length*2^(Ones+1)/3 Width 5 29 618 10 6 41 1749 11 7 147 12544 14 8 253 43178 16 9 306 1_04448 17 10 1_636 11_16842 21 11 8_951 122_21098 24 12 12_276 335_21664 25 13 14_271 779_38688 27 14 31_202 3408_09045 29 15 15_601 3408_09045 29 16 47_468 20739_08565 31 17 158_670 1_38647_96160 34 18 79_335 1_38647_96160 34 19 2_858_055 99_89626_26560 40 20 1_524_296 106_55601_34997 40 21 762_148 106_55601_34997 40 22 381_074 106_55601_34997 40 23 190_537 106_55601_34997 40 24 32_343_822 36175_95253_06368 49 25 21_562_548 48234_60337_41824 49 26 10_781_274 48234_60337_41824 49 27 118_212_940 10_57751_48180_00213 54 28 343_857_546 61_53570_47730_33984 56 29 171_928_773 61_53570_47730_33984 56 30 1_987_866_895 1422_97055_04710_10986 61 31 1_192_720_137 1707_56466_05652_13184 61 32 795_146_758 2276_75288_07536_17578 61 33 397_573_379 2276_75288_07536_17578 61 34 19_760_456_010 2_26321_36617_86577_30560 68 35 13_173_637_340 3_01761_82157_15436_40746 69 36 6_586_818_670 3_01761_82157_15436_40746 69 37 72_057_431_991 66_02332_02848_19022_15168 73 38 412_584_135_936 756_06842_48292_43028_93056 77 39 275_056_090_624 1008_09123_31056_57371_90741 77 40 137_528_045_312 1008_09123_31056_57371_90741 77 41 3_149_971_404_836 46179_06915_70544_51177_66314 82 42 4_656_193_084_598 1_36521_02500_49520_38789_17461 84
This table lists run lengths and the corresponding Seeds. We are primarilly interested in the width of the smallest Seed. Despite their chaotic nature, if their widths are plotted we get a linear trend line with little variance:
Average Minimum Required Seed Width: 0.0878 * Length + 14.447 Run Minimum Minimum Required Required Length Seed Seed Width 40 6887 13 41 4591 13 42 13439 14 43 6383 13 44 4255 13 45 7963 13 46 6383 13 47 12399 14 48 7279 13 49 1583 11 : : : 370 1220_47117_99967 44 371 281_35382_12167 42 372 2383_89422_84287 45 373 250_09228_55259 42 374 2872_96231_36495 45 375 2268_60535_77471 45 376 2553_74427_87995 45 377 1702_49618_58663 44 378 4299_29337_89863 46 379 3075_01027_27359 45
The maximum allowed Seed is far smaller than the required Seed. This means that a loop cannot be formed. The values also diverge. The maximum possible Seed width increases linearly and the larger required Seed width increases exponentially. As the number of leading ones increases the divergence increases exponentially so there is no hope of finding an aberant Seed even further out to form a loop.
Here are the combined results of the previous computations. Note that only the first 5 required Seed widths can be easilly computed. The remaining values are projected from the trend line for run lengths. For values of ones the required Seed with exceeds the run length / 12; which is much greater than the allowed Seed width.
Seedmax width is linear by Ones 2.0003 * Ones + 26.746 Seedmin width is linear by Length 0.0878 * Length + 14.447 Seedmin width is exponential by Ones Lowest Maximum Minimum 3^Power = Allowed Required Ones Run Length Seed Width Seed Width 5 29 10 12 Actual 6 41 11 13 Required 7 147 14 25 Widths 8 253 16 33 9 306 17 35 10 1_636 21 158 Projected 11 8_951 24 800 Required 12 12_276 25 1_092 Widths 13 14_271 27 1_267 14 31_202 29 2_754 15 15_601 29 1_384 16 47_468 31 4_182 17 158_670 34 13_946 18 79_335 34 6_980 19 2_858_055 40 250_952 20 1_524_296 40 133_848 21 762_148 40 66_931 22 381_074 40 33_473 23 190_537 40 16_744 24 32_343_822 49 2_839_802 25 21_562_548 49 1_893_206 26 10_781_274 49 946_610 27 118_212_940 54 10_379_111 28 343_857_546 56 30_190_707 29 171_928_773 56 15_095_361 30 1_987_866_895 61 174_534_728 31 1_192_720_137 61 104_720_842 32 795_146_758 61 69_813_900 33 397_573_379 61 34_906_957 34 19_760_456_010 68 1_734_968_052 35 13_173_637_340 69 1_156_645_373 36 6_586_818_670 69 578_322_694 37 72_057_431_991 73 6_326_642_543 38 412_584_135_936 77 36_224_887_150 39 275_056_090_624 77 24_149_924_771 40 137_528_045_312 77 12_074_962_393 41 3_149_971_404_836 82 276_567_489_359 42 4_656_193_084_598 84 408_813_752_842
Bounds were derived that are useful for understanding loops in the Collatz conjecture.
Calculations using these constraints show there are no loops with fewer than 6_586_818_670 odd elements. These constraints are consistent with observable behavior. As such they help explain what we are seeing. In turn this makes the constraints easy to validate or to find a counter example should something be mistaken.
We've shown that Seeds for long run lengths are limited to relatively small values. For larger values the upper bound on Seeds grows linearly while the corresponding run lengths have exponential growth. While this is compelling evidence that it is far harder to generate a very long run from such a small Seed, this is only an observation.
[1] Barina, David (2020). "Convergence verification of the Collatz problem". The Journal of Supercomputing. 77 (3): 2681-2688. doi:10.1007/s11227-020-03368-x. S2CID 220294340.
[2] Eliahou, Shalom (1991). "The 3x+1 problem: new lower bounds on nontrivial cycle lengths". Discrete Mathematics 118 (1993) 45-56 North Holland.
The majority of runs are trivialy short. The length of a short run is easily determinted by the low order bits of the Seed. When the low bits for different Seeds match, the run length will be the same regardless of the upper bits. However, in non-trivial runs the low order bits do not determine the run length.
Looking at the low 4 bits of a Seed we can tell the the length of trivial runs except for 7, #B, or #F. The short lengths are either 1 or 2.
Low bits: 1 3 5 7 9 B D F Length: 1 2 1 x 1 x 1 x
By constructing an array with entries of the non-trivial low bits you can generate Seeds that only have longer runs. In the example above 5 out of 8 runs are trivial. Note that if the Seed is even it is immediately divided by 2 and goes below the Seed. Consequnetly even Seeds can be discarded out of hand.
Here is the pyramid with low order bits (in hexadecimal) for non-trivial runs ranging from 1 to 8 bits. Each branch to the left insert a one bit to the front of the value. Right branches insert a leading zero. The leaf nodes at the bottom are the low 8 bits of Seeds needed to for a long run.
1 | ____________ 3 ____________ / \ ___________ 7 ___________ 3 / \ | _________ f _________ 7 b / \ | | ___ 1f ___ _ f _ _ 7 _ 1b / \ / \ / \ / \ 3f 1f 2f f 27 7 3b 1b / \ / \ / \ | / \ | / \ | 7f 3f 5f 1f 6f 2f 4f 67 27 47 7b 5b 1b / \ / \ | / \ / \ | | / \ | | / \ | | ff 7f bf 3f df 9f 1f ef 6f 2f cf e7 67 a7 47 fb 9b 5b 1b
To produce a series of candidate Seeds you can combine low Seed bits with upper bits of your choosing. The low bits would be in the set of values for non-trivial runs.
An array of 20 bit low order values for long runs has 27328 elements. Using it to create Seeds eliminates 94.8% of values that are trivial. This was used to speed up exhaustive searches in experimental runs. To generate Seeds without trivial runs use nested loops as below using an array of low bit values named Long.Run.
DO Upper = 0 to Limit: DO over Seeds (may start and end anywhere), DO @Lower in Long.Run: DO over an array of low bits, Seed = (Upper * 2^20) + Lower; Seed with a non-trivial length < Process the Seed here as you wish. > - -
To visualize navigating the tree imagine driving through it from the top down. At each trivial branch there is a stop light and only appears some of the time on the left branches. The bits of the Seed value from low to high guide you through the tree. Turn right on a one and turn left on a zero. choose the right Seed and you'll go through all greens.
After each branch there are fewer green lights and a higher percentage of red lights. Since the Seed is finite you are going to run out of bits to guide you. Then you are out of control and there are more and more red lights. To avoid them you need to know where they are, but you don't have that information. Instead each turn is determined by bumps in the road. Eventually you are going to reach a red light.
This table compares the smallest seeds yielding ever longer run lengths to the maximum seed permitted to form a loop. The maximum seed allowed is far far smaller than the actual seed. This shows why non-trivial loops cannot be formed. Computations were sped up by filtering out Seeds for trivially short runs.
Maximum Allowed Seed <= {Length * 2^(Ones + 1)} / 3 Actual Length Lead Bits Lead Allowed Seed of 3^Length Ones Seed 27 37 1100_0111 2 98 703 51 1011_0100 1 68 10_087 66 1101_1011 2 178 35_655 85 1011_1001 1 113 270_271 103 1010_0000 1 137 362_343 104 1111_0000 4 554 381_727 109 1100_0011 2 290 626_331 111 1110_0001 3 1109 1_027_431 115 1011_1001 1 153 1_126_015 141 1001_1000 1 188 8_088_063 155 1111_1000 5 3306 13_421_671 181 1111_1011 5 3861 20_638_335 184 1010_0010 1 245 26_716_671 188 1101_1011 2 501 56_924_955 194 1111_1000 5 4138 63_728_127 237 1000_0100 1 316 217_740_015 249 1011_0001 1 332 1_200_991_791 251 1010_0101 1 334 2_788_008_987 282 1010_1000 1 376 12_235_060_455 345 1000_1010 1 460 898_696_369_947 347 1110_0001 3 1850 2_081_751_768_559 382 1110_0110 3 2037 13_179_928_405_231 434 1010_1000 1 578
After completing many runs we find that certain Seed values come closest to reaching themselves. The first thing that stands out is that as Seeds get larger the size of any near misses gets larger. That is, the amount by which runs miss the horizon increases.
To compensate for ever increasing misses, instead of looking at absolute values we divide the Seed by the miss. To find misses that are relatively close we take the ratio of the Seed to the miss just below the horizon. It turns out that the ratio is useful because it characterizes the way misses behave.
Ratio = Seed / (Seed - NL)
Nearly all misses are relatively large and the ratio is small (< 10). On some runs though a few ratios are noticably larger. This happens when any value of 3^L is near a power of 2. First lets look at ratios for runs with 29 steps.
Seed NL < Seed Miss Seed / (Seed - NL) 3431 3349 82 41.84146 4575 4465 110 41.59090 17919 17479 440 40.72500 20639 20131 508 40.62795
As the Seeds get larger we still get some 40.+ ratios, but in this case the first ratio is highest. Next lets list the highest ratios found for various run lengths. Ratios for all other lengths in this region were much smaller.
L = Length Seed NL < Seed Miss Seed / (Seed - NL) 41 4591 4541 50 91.82000 94 432923 428885 4038 107.21223 147 50726683 50358403 368280 137.73944 200 495828479 493257605 2570874 192.86378 253 6250517663 6231106441 19411222 322.00537 306 1152923546776074687 = Seed 978.74477 - 1151745585392149697 = NL < Seed ------------------- 1177961383924990 = Miss
Note that in all these cases three raised to the power of the Length has several leading one bits.
Length: 41 94 147 200 253 306 Lead ones in 3^Length: 6 6 7 7 8 9
To see how the loop constraints can be met in a loop we look to the 3*n-1 variation of the Collatz series. In addition to terminating at one this version has two small circularities.
N is Even: N' = N / 2 N is Odd: N' = (3 * N - 1) / 2 Seed Values 5 14, 7, 20, 10, 5, ... 17 50, 25, 74, 37, 110, 55, 164, 82, 41, 122, 61, 182, 91, 272, 136, 68, 34, 17, ...
Next we plug in the values for these cycles into the revised series. Note that in this case the addition in the sequence is flipped to subtraction. This cuases the sign in the D series to flip.
i Ei Ki Ni Di Seed = 5 0 0 0 5 0 1 1 1 7 -1 2 3 4 5 -5 Seed divides D: D2 = -5 = 5 * -1 Close powers: 2^3 = 3^2 + D2 / Seed 8 = 9 - 1 i Ei Ki Ni Di Seed = 17 0 0 0 17 0 1 1 1 25 -1 2 1 2 37 -5 3 1 3 55 -19 4 2 5 41 -65 5 1 6 61 -227 6 1 7 91 -745 7 4 11 17 -2363 Seed divides D: D7 = -2363 = 17 * -139 Close powers: 2^11 = 3^7 + D7 / Seed 2048 = 2187 - 139
There are also small loops in varient sequences where odds transition to 3n+C; where C is a positive odd constant and not a multiple of 3. In this case the loop constraints are revised as follows.
Seed * C * 2^KL = Seed * 3^L + C * DL 2^KL = 3^L + C * DL / Seed
3*N+19: Seed = 5 C = 19 Loop = {5, 17, 35, 31, 7} N K D 5 0 0 17 1 1 35 2 5 31 4 19 5 * 2^11 = 10240 = 5 * 3^5 + 475 7 8 65 5 11 475 2^11 = 3^5 + 19 * 475 / 5 = 2048
Using the equality, here are some loops for C under 100:
3*N+C 2^K = 3^L + C*DL / Seed Loop values 3*N+ 5: 2^27 = 3^17+5*189900931/187 187 283 427 643 967 1453 1091 1639 2461 1847 2773 2081 781 587 883 1327 1993 3*N+19: 2^11 = 3^5 +19*475/5 5 17 35 31 7 3*N+23: 2^5 = 3^2 +23*7/7 7 11 3*N+25: 2^27 = 3^17+25*189900931/935 935 1415 2135 3215 4835 7265 5455 8195 12305 9235 13865 10405 3905 2935 4415 6635 9965 3*N+25: 2^27 = 3^17+25*352383011/1735 1735 2615 3935 5915 8885 3335 5015 7535 11315 16985 12745 9565 1795 2705 2035 3065 2305 3*N+29: 2^17 = 3^9 +29*42251/11 11 31 61 53 47 85 71 121 49 3*N+29: 2^65 = 3^41+29*55258418497115015011/3811 3811 5731 8611 12931 19411 29131 43711 65581 49193 18451 27691 41551 62341 46763 70159 105253 78947 118435 177667 266515 399787 599695 899557 674675 1012027 1518055 2277097 853915 1280887 1921345 180127 270205 202661 152003 228019 342043 513079 769633 36077 27065 10153 3*N+35: 2^8 = 3^4 +35*85/17 17 43 41 79 3*N+47: 2^7 = 3^4 +47*85/85 85 151 125 211 3*N+49: 2^38 = 3^22+49*124233085375/25 25 31 71 131 221 89 79 143 239 383 599 923 1409 1069 407 635 977 745 571 881 673 517 3*N+53: 2^29 = 3^17+53*792382399/103 103 181 149 125 107 187 307 487 757 581 449 175 289 115 199 325 257 3*N+55: 2^5 = 3^3 +55*19/209 209 341 539 3*N+59: 2^10 = 3^6 +59*665/133 133 229 373 589 913 1399 3*N+65: 2^24 = 3^12+65*4748765/19 19 61 31 79 151 259 421 83 157 67 133 29 3*N+67: 2^30 = 3^16+67*261519653/17 17 59 61 125 221 365 581 905 1391 265 431 85 161 275 223 23 3*N+71: 2^10 = 3^5 +71*341/31 31 41 97 181 307 3*N+79: 2^23 = 3^14+79*12094865/265 265 437 695 541 851 329 533 839 649 1013 1559 1189 1823 1387 3*N+83: 2^12 = 3^7 +83*2507/109 109 205 349 565 889 1375 263 3*N+85: 2^100 = 3^56+85*104351656096075462196663857141/7 7 53 61 67 143 257 107 203 347 563 887 1373 1051 1619 2471 3749 2833 1073 413 331 539 851 1319 2021 1537 587 923 1427 2183 3317 2509 1903 2897 1097 211 359 581 457 91 179 311 509 403 647 1013 781 607 953 23 77 79 161 71 149 133 121 3*N+89: 2^17 = 3^8 +89*23783/17 17 35 97 95 187 325 133 61 3*N+91: 2^12 = 3^6 +91*2183/59 59 67 73 155 139 127
Here are two longer loops with a lengths of 336 and 366:
::::::: 3*N+3365 Seed = 203 1987 4663 8677 7349 6353 2803 5887 10513 4363 8227 14023 22717 17879 28501 22217 547 2503 ... 10241 4261 4037 3869 3743 7297 3157 3209 2^624 = 3^336 + 3365 * 4199796658507963543348192033850161584917117175526301353269734434329173562120419354780413672772415171801864233399236023196347164276301311627415122547557045765655556868486257128575741096209 / 203 ::::::: 3*N+4351 Seed = 245 2543 2995 1667 1169 3929 8069 14279 11797 19871 15991 13081 21797 34871 27241 43037 66731 799 ... 49705 76733 117275 22261 35567 27763 10955 1163 2^672 = 3^366 + 4351 * 1103402814167822474881382071325480137263443016358074802724808612970327427257831590024986064792407378610873873845382384566581274222795067539453507073520344455356045309389679513583085955741034037736313165 / 245