Project Euler Problem 3

Jun 13th, 2022

11 min read

Largest Prime Factor

Project Euler presents the following Problem #3:

The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143 ?

Definitions
Fundamental Theorem of Arithmetic: Every positive integer greater than 1 will have a unique representation as a product of prime numbers found through prime factorization.
Prime Factorization: The decomposition of a positive integer greater than 1 into a product of smaller prime numbers.
Prime Number: A positive integer greater than 1 that is only divisible by 2 numbers: 1 and the integer itself.
Composite Number: A positive integer greater than 1 that is not prime, in that it has factors other than 1 and the integer itself.

The simplest solution to find the prime factors of a number $n$ involves repeatedly dividing out all factors (trial division) until the maximum possible factor is reached. The maximum factor is determined by $\lfloor\sqrt{n}\rfloor$ based on the principle of cofactors, so there can be at most one prime factor greater than this maximum value, which would be $n$ itself if $n$ is prime.

The Unit 1

The solutions do not include the number 1 as the number 1 is considered neither composite nor prime and has no prime factors, so, under current convention, it cannot undergo prime factorization. 1 is the only factor of 1 and 1 is the only number that has only 1 factor.

Here's a lovely read on why 1 isn't a prime number.

I've chosen to store the prime factors in a List and return the final (largest) prime found, which could be simplified even further. This will look different compared to the version of this helper function in my GitHub repository as that version has been adapted to meet the needs of future problem solutions by storing the prime factors & their counts in a Map.

1import kotlin.math.sqrt
2
3fun largestPrimeFactor(n: Long): Long {
4    require(n > 1) { "Must provide a natural number greater than 1" }
5    var num = n
6    val primes = mutableListOf<Long>()
7    val maxFactor = sqrt(num.toDouble()).toLong()
8    val factors = listOf(2L) + (3L..maxFactor step 2L)
9    for (factor in factors) {
10        while (num % factor == 0L) {
11            primes.add(factor)
12            num /= factor
13        }
14    }
15    if (num > 1) primes.add(num)
16    return primes.last()
17}

Line 8: After the smallest prime number 2, every other prime number will be odd, so we can step by 2 for every factor.

Line 15: If $n$ is composite and the loop divides out all prime factors, num will equal 1 when this line is reached. If $n$ is prime, num will not reach 1 as its second factor ( $n$ itself) will not have been included in the loop, so num will equal $n$ .

Here's an example of the resulting List when largestPrimeFactor(40) is called:

\begin{align*} \frac{40}{2}&=20\quad\mapsto\quad[2]\\[1em] \frac{20}{2}&=10\quad\mapsto\quad[2,\;2]\\[1em] \frac{10}{2}&=5\quad\mapsto\quad[2,\;2,\;2]\\[1em] 2&\nmid5\\[1em] 3&\nmid5\\[1em] \frac{5}{5}&=1\quad\mapsto\quad[2,\;2,\;2,\;5] \end{align*}

Where Did The Floor Go?

I told you that we'd be using $\lfloor\sqrt{n}\rfloor$ to set the upper loop limit and, yet, there's no floor() in largestPrimeFactor(). The kotlin.math package does provide this function, taking a Double value and returning a Double that has been rounded down towards negative infinity.

So why didn't I use that?

Because Kotlin's type cast toLong() already floors the value by truncating its fractional part. No import statement required either. I will occasionally need to import floor() when working solely with floating-point types; otherwise, casting to Int or Long does the trick.

Don't confuse casting with the other kotlin.math package function roundToLong() that rounds the floating-point value to the nearest integer, which may be towards negative or positive infinity.

Here's a quick codeblock to compare the three functions mentioned in this snippet:

1import kotlin.math.floor
2import kotlin.math.roundToLong
3
4val doubles = listOf(1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9)
5
6val flooredDoubles: List<Double> = doubles.map(::floor)
7val castedToLong: List<Long> = doubles.map(Double::toLong)
8val roundedToLong: List<Long> = doubles.map(Double::roundToLong)
9
10println("Using floor() -> $flooredDoubles")
11println("Using toLong() -> $castedToLong")
12println("Using roundToLong() -> $roundedToLong")

Using floor() -> [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
Using toLong() -> [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Using roundToLong() -> [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]

As an extra aside, did you notice the different syntax used for the first map()? The function references look different becausetoLong() and roundToLong() are extension functions that are called on a value, while floor() takes a value as its argument. The parentheses could be omitted entirely from all three invocations and replaced with trailing lambda expressions to better emphasize this difference.

1val flooredDoubles: List<Double> = doubles.map { floor(it) }
2val castedToLong: List<Long> = doubles.map { it.toLong() }
3val roundedToLong: List<Long> = doubles.map { it.roundToLong() }

Direct Search Factorization - Simplified

So you've found a solution that gives the correct answer, but there was lot going on in that function.

We propagated a list of every prime factor found, even duplicates, using another list of factor candidates, when all we really needed was the final factor found. We also only found the square root of the original argument, even though, as factors get divided out and the original number decreases in value, the same cofactor principle applies to the decreasing number.

This simplified function checks that each factor squared (instead of calculating the square root) does not exceed the current value of the decreasing number and terminates if it does, as this would mean that the current value is a prime and the largest factor. It will also catch this in the second while loop (line 5) if the factor increases enough to equal the current value of the decreasing number, for the same reason.

1fun largestPrimeFactorSimple(n: Long): Long {
2    var num = n
3    var factor = 2L
4    while (factor * factor <= num) {
5        while (num % factor == 0L && num != factor) {
6            num /= factor
7        }
8        factor++
9    }
10    return num
11}

Here's what happens when largestPrimeFactorSimple(40) is called:

num=40\quad factor=2\\[1em] 2^2\le40\\[1em] num=\frac{40}{2}=20\\[1em] num=\frac{20}{2}=10\\[1em] num=\frac{10}{2}=5\\[1em] 2\nmid5\quad\to\quad factor=3\\[1em] 3^2>5\\[1em] largest=num=5

This is a significantly less verbose function than the first solution, and maybe it loses a bit of clarity if you're looking at it for the first time. Yes it could also be adjusted to have two loops: one that handles the smallest prime 2, then a second that handles only odd numbers by incrementing by 2. Yes this would mean cutting the potential factors (and undesired composites) in half, but spoiler alert ⚠️, it already gives a performance boost as it is now.

Sieve of Eratosthenes

We looked at the super simple solution, now let's go in the opposite direction.

In the first trial division solution largestPrimeFactor(), the loop is optimized to only iterate through odd numbers, but many of the factors will not be prime. It is not necessary to perform a prime check as composites will always fail the conditional branch of the nested while loop (line 10). This is because all primes smaller than the composite will already have been checked, therefore all prime factors of that composite will also have been checked. This means the composite itself cannot be a factor as all its prime factors would have already been divided out of $n$ .

If you do want to avoid looping through any composites, you can use the Sieve of Eratosthenes algorithm to generate all prime numbers $\le\sqrt{n}$ . Understanding this algorithm from now will come in handy for future Project Euler solutions.

See a visual representation of how the sieve works.

1import kotlin.math.sqrt
2
3fun primeNumbers(n: Int): List<Int> {
4    if (n < 2) return emptyList()
5    val oddSieve = (n - 1) / 2
6    val boolMask = BooleanArray(oddSieve + 1) { true }
7    for (i in 1..(sqrt(1.0 * n).toInt() / 2)) {
8        if (boolMask[i]) {
9            var j = 2 * i * (i + 1)
10            while (j <= oddSieve) {
11                boolMask[j] = false
12                j += 2 * i + 1
13            }
14        }
15    }
16    return boolMask.mapIndexed { i, isPrime ->
17        if (i == 0) 2 else if (isPrime) 2 * i + 1 else null
18    }.filterNotNull()
19}

Line 5: The classic algorithm creates a boolean mask of all numbers $\le n$ but in this version we only allocate mask memory to include odd numbers (half the space).

Line 6: The boolean mask created represents numbers [2] + [3..n step 2]. For example, index 1 represents the number 3 based on the formula $2*1+1$ and index 2 represents the number 5 based on the formula $2*2+1$ .

Line 7: boolMask[0] represents the number 2 and is skipped by the loop's lower limit. The upper limit is set by $\sqrt{n}$ and then divided by 2 as the mask only includes odd numbers.

Line 8: boolMask[i] == true means that the odd number represented by that index is a prime.

Line 9: j corresponds to the next index at which a multiple of the odd prime exists.

Line 11: Mark these multiples (composites) as false.

Line 16: Transform the boolean mask to a list by filtering out false values and converting remaining indices to their corresponding prime numbers.

The Classic Sieve of Eratosthenes

Here's the code for the algorithm that uses a full boolean mask.

1import kotlin.math.sqrt
2
3/**
4 * @return all prime numbers less than or equal to [n]
5 */
6fun primeNumbers(n: Int): List<Int> {
7    if (n < 2) return emptyList()
8    val boolMask = BooleanArray(n + 1) { true }
9    for (p in 2..sqrt(1.0 * n).toInt()) {
10        if (boolMask[p]) {
11            if (p * p > n) break
12            for (m in (p * p)..n step p) {
13                boolMask[m] = false
14            }
15        }
16    }
17    return boolMask.mapIndexed { i, isPrime ->
18        if (i > 1 && isPrime) i else null
19    }.filterNotNull()
20}

Both are quite good, but I've found a decent enough performance boost with the optimized version to make it my go-to. Here's a benchmark for the two algorithms finding all prime numbers less than or equal to 1 million over 1000 iterations.

ALGORITHM	BEST (ms)	AVERAGE (ms)	WORST (ms)
Classic Sieve	11.34	19.87	211.66
Odd Sieve	6.05	12.32	110.07

The values returned by primeNumbers() can then be used as the factors that are divided out of $n$ , with all other parts of the original function remaining the same.

1import kotlin.math.sqrt
2
3fun largestPrimeFactorSieve(n: Long): Long {
4    require(n > 1) { "Must provide a natural number greater than 1" }
5    if (n == 2L) return n
6    var num = n
7    val primes = mutableListOf<Long>()
8    val maxFactor = sqrt(num.toDouble()).toInt()
9    val factors = primeNumbers(maxFactor)
10    for (factor in factors) {
11        while (num % factor == 0L) {
12            primes.add(1L * factor)
13            num /= factor
14        }
15    }
16    if (num > 1) primes.add(num)
17    return primes.last()
18}

Line 5: Since $\sqrt{n}$ is passed as an argument to primeNumbers(), n == 2L will result in an empty list unless it is caught as an edge case.

Speed Comparison

So how do the three solutions compare against each other?

When running benchmarks, for curiosity's sake, I used two different types of composites:

Small factor composites, like 1e12 that has a prime factorization of $2^{12}*5^{12}$ .
Large factor composites, like 600851475143 that has only 4 large factors.

	TRIAL DIVISION
N	ORIGINAL	SIMPLIFIED	SIEVE
1e12	58.90ms	3073ns	14.72ms
600851475143	39.55ms	2.6e+04ns	10.79ms

Based on the average execution time over 1000 iterations, the simplified solution had the best performance on my system. It iterated over every factor starting at 2 and, regardless of whether the largest factor was as small as 5 or much larger than that, it still surpassed the optimized options.

We by no means had to create a prime number generator to solve this problem and there is definitely such a thing as over-engineering a solution. But the function using the sieve performed better than the original function and you've gotten a preview at a useful algorithm that we'll already be revisiting (albeit for fun not necessity) in Problem #5.

Here's the full source code for this problem (and its corresponding test suite).

This solution set is also available in Python and in C++ on my GitHub.

Problem content on the Project Euler site is licensed under a Creative Commons (CC) License: CC BY-NC-SA 4.0. Solutions to only the first hundred problems are encouraged to be distributed publicly.

Thanks for reading!🏝️🎭💡