Introduction to Coding Agents

SatoshiTerasaki@AtelierArith

Overview

Self-introduction
Introducing to AI-powered software
Introducing to AI agents (Cursor etc.)
(~~Live Coding~~ Live Generating code exercises with several topics)

Self-introduction

Name: Satoshi Terasaki
- Freelance software engineer.
GitHub: terasakisatoshi
Organization: AtelierArith
- Publishes Julia packages I have created.

Actually

The software I will introduce leverages AI.
- Over 90% of the implementation I was responsible for was generated by AI. This is a fact.
I like dynamic languages like Julia and was not good at compiled languages.
- Thanks to AI, I can now use programming languages other than Julia:
  - Rust
  - C++
  - Swift
  - Dart
  - TypeScript

Recent work: Organizing SparseIR ecosystem

Collaborative development with Hiroshi Shinaoka, Hitoshi Mori, Markus Wallerberger
- Porting SparseIR.jl v1.0 to C++: libsparseir (C++)
- Porting libsparseir to Rust: sparse-ir-rs
- SparseIR.jl v2.0 (Julia interface for sparse-ir-rs)
- sparse-ir v2.0 (Python interface for sparse-ir-rs)
See also Why I Migrated from C++ to Rust (Development Experience in the Age of Generative AI) by Hiroshi Shinaoka

Recent work: Development with `tensor4all` members

tensor4all-rs: A Rust implementation of tensor networks
- Porting Jutho/Strided.jl to Rust strided-rs with Hiroshi Shinaoka
- strided-rs-benchmark-suite
tenferro-rs
- A general-purpose tensor computation library in Rust with CPU/GPU support.

Recent work: `mVMC`

Takahiro Misawa et al.
- Porting C project mVMC to Julia (WIP)

We could achieved better execution speed than the original C implementation.
This project is still work in progress. It will be released in the future.

Recent work: SubsetJuliaVM

Let’s run Julia on iPad and iPhone

Recent work: RustCall.jl

A package that allows calling Rust from Julia. Its design was inspired by Cxx.jl.

using RustCall

rust"""
#[julia]
fn add(a: i32, b: i32) -> i32 {
    a + b
}
"""

# Call directly - wrapper is auto-generated
add(10, 20)  # => 30

Recent work: better-pluto-client

We can view Pluto notebook on VS Code.

AI Driven Development

Thanks to AI, I can now use programming languages other than Julia and develop many software products rapidly.
We can give instructions in natural language for what we want to build, and AI agents generate source code and execute commands.
We call this development style as AI Driven Development.

Did you always take it seriously?

Were you always thinking about efficient development methods?

Really? (Especially writing tests and documentation)

In reality

Where is the source code? (Git is not used. No version control)

No documentation, or it’s outdated

Code cannot compile (C/C++/Fortran90/FORTRAN77)

No unit tests

Runtime errors occur (segmentation faults, etc. C++!!!)

Dataset URLs point to a researcher’s personal C: drive
- (hardcoded variables in the SOTA machine learning source code)

Build reproducible software for us!

When researchers develop software, it is important to build reproducible software.
AI-powered software development is one solution for building reproducible software.

Humans and machines (1)

Why real-world software development is so far from ideal.
Humans are creative beings
- Researchers surely have wonderful ideas every day.
However, they “get tired,” “get bored,” and “get stubborn”
- Overusing git commit -m "update" and git commit -m "fix"
  - especially during CI/CD pipeline debugging

Sticking to a specific programming language
- This creates an environment where it is difficult to adopt modern development practices like using a package manager.

Humans and machines (2)

This is my personal opinion

Use Julia or Rust for scientific computing.
- They have modern package management systems that enable collaborative development with others.

Julia provides fast execution while remaining a high-level language.

Rust is a fast static language. Those who write C/C++ should consider switching.
Rust has a learning curve when written manually, but AI code generation dramatically improves the development experience.
- I believe the tensor4all-rs ecosystem for tensor networks aims to be a very good example of using Rust for scientific computing.

The rise of coding agents

Switching and learning new technology requires paying a learning cost, but using AI agents solves this problem.

Humans and machines (3)

Machines “don’t get bored,” “don’t get tired,” and “AI models evolves”
- Generate appropriate commit and pull request summaries from code diffs

Humans and machines (4)

Machines “don’t get bored,” “don’t get tired,” and “AI models evolves”
- Can use various languages. Good at porting:
- Julia -> C++, C++ -> Rust, C -> Julia
- TypeScript, Swift, Flutter/Dart
Can adapt to requests. Migration to modern languages is possible
Requires appropriate instructions from humans

Are humans unnecessary? (1)

LLMs have much knowledge but may not have cutting-edge knowledge like researchers.
- What to implement or not depends on human judgment.
Software engineers are freed from coding time but need the following capabilities:
- Which technologies to use when
- Designing architecture for high-quality software
- Higher-level decision-making ability
- Minimal domain knowledge to discuss with researchers is also required.

Are humans unnecessary? (2)

AI can slack off when writing code
- Satisfied with writing only placeholders
- Implements only special cases when a general implementation is needed

Not always best-practice implementation
- In Julia: doesn’t use multiple dispatch, overuses export
- In Rust: overuses .clone(), degrading performance
Minimal knowledge of how to build and run code, coding, and programming languages is necessary.

Are software engineers unnecessary?

This is an extreme claim such as:
- Can we fire researchers because AI can generate natural languages, which means it can write papers?
Jobs disappear when a certain passion disappears.
It’s important to never lose your curiosity.
- Even from SOTA internet memes like 좋아요 (likes) or you can have fun learning Korean.
- There is a chance to get a new job!!! (Maybe) See the 3rd party video: 스피키 성우님을 실제로 만났다 to learn more.

Demo

Introducing coding with Cursor.

Alternatives: Antigravity, Claude Code, Codex, and Copilot.

Which tool should you use? (1)

There is no single answer since AI agents are constantly evolving
Depends on which accounts you have
- If you have ChatGPT: try Codex, GPT-5.3
- If you have Claude: try Claude Code: Opus-4.6
Want to use for free
- Use GitHub Copilot, Google Antigravity
All services have usage limits. It all comes down to budget

Which tool should you use? (2)

As for me

Claude Code MAX $200/$100: Opus-4.6
ChatGPT Plus plan: Codex: GPT-5.3
Cursor Pro: Daily editor with Composer 1.5

Hardware

I like to use a ultrawide monitor

Required CLI tools

git, gh: Operate Git/GitHub
npm, node: Installing coding agents
uv: Python package manager
juliaup, julia: Required to run Julia code
rustup, rustc, cargo: Required to build and run Rust code

Optional:

tmux
docker
devcontainer
rg
tree
htop
quarto

Software Installation

See preparation.qmd

Installation (Cursor)

Go to the link below:

https://cursor.com/download
VS Code users can migrate extensions

Cursor UI

Math Quiz

Given two positive integers $a$ and $b$, what is the probability that their greatest common divisor $\gcd(a,b)$ equals 1?

Namely Prob(gcd(a, b) = 1) = ???.

The answer is

\[ \begin{aligned} \mathrm{Prob}(\gcd(a, b) = 1) &= \frac{1}{\zeta(2)} \\ &= \frac{6}{\pi^2} \\ &\approx 0.6079271018540267 > 0.5 \end{aligned} \]

where $\zeta$ is the Riemann zeta function.

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}= \prod_{p: \mathrm{primes} } \left(\frac{1}{1 - p^{-s}} \right) \mathrm{} \]

The second equality is the equation known as the Euler product.

Proof

\[ \begin{aligned} \mathrm{Prob}(\gcd(a, b) = 1) &= \prod_{p: \mathrm{primes} } \left( 1 - \mathrm{Prob}(p | a \textrm{ and } p | b)\right) \\ &= \prod_{p: \mathrm{primes} } \left( 1 - \mathrm{Prob}(p | a)\mathrm{Prob}(p | b)\right) \\ &= \prod_{p: \mathrm{primes} } \left( 1 - \frac{1}{p}\cdot\frac{1}{p}\right) \\ &= \prod_{p: \mathrm{primes} } \left( 1 - p^{-2}\right) \\ &= \left(\prod_{p: \mathrm{primes} } \left(\frac{1}{1 - p^{-2}} \right)\right)^{-1} \\ &=\left(\zeta(2)\right)^{-1} \\ &= \frac{1}{\zeta(2)} = \frac{6}{\pi^2} \end{aligned} \]

JS Implementation (1)

From the proof above we obtain:

\[ \pi = \sqrt{\frac{6}{\mathrm{Prob}(\gcd(a, b)=1)}} \]

Let’s verify with numerical computation

// Function to approximate pi
// using probability that two numbers are coprime
function calcPi(N) {
    let cnt = 0;    // Counter for coprime pairs
    // Loop through all pairs (a, b) where 1 <= a, b <= N
    for (let a = 1; a <= N; a++) {
        for (let b = 1; b <= N; b++) {
            // Check if a and b are coprime
            if (mygcd(a, b) === 1) {
                cnt += 1;  // Increment counter if coprime
            }
        }
    }
    // Probability that two numbers are coprime
    let prob = cnt / (N * N);
    // Approximate pi using the formula: pi ≈ sqrt(6 / prob)
    return Math.sqrt(6 / prob);
}

JS Implementation (2)

Let’s calculate $\pi \approx 3.14...$ using the greatest common divisor:

// approx_pi_gcd.js

// Function to compute the greatest common divisor (GCD) using the Euclidean algorithm
function mygcd(a, b) {
    // Loop until the remainder is zero
    while (b !== 0) {
        let tmp = b;    // Store the value of b temporarily
        b = a % b;      // Update b to the remainder of a divided by b
        a = tmp;        // Set a to the previous value of b
    }
    return a;           // When b is zero, a is the GCD
}

// Function to approximate pi using probability that two numbers are coprime
function calcPi(N) {
    let cnt = 0;    // Counter for coprime pairs
    // Loop through all pairs (a, b) where 1 <= a, b <= N
    for (let a = 1; a <= N; a++) {
        for (let b = 1; b <= N; b++) {
            // Check if a and b are coprime
            if (mygcd(a, b) === 1) {
                cnt += 1;  // Increment counter if coprime
            }
        }
    }
    // Probability that two numbers are coprime
    let prob = cnt / (N * N);
    // Approximate pi using the formula: pi ≈ sqrt(6 / prob)
    return Math.sqrt(6 / prob);
}

// Main function to run the pi approximation
function main() {
    let N = 10000;            // Number limit for coprimality checking
    let pi = calcPi(N);       // Approximate pi
    console.log(`N: ${N}`);   // Output N
    console.log(`pi: ${pi}`); // Output approximation of pi
}

main(); // Call the main function

Result

$ node approx_pi_gcd.js
calcPi: 2.175s
N: 10000
pi: 3.141534239016629

Exercise

Port the JavaScript code above to Julia and Rust. Measure the execution time when N=10000.

Hint

Use the following prompts:

Port approx_pi_gcd.js to Julia and save the result as approx_pi_gcd.jl

Port approx_pi_gcd.js to Rust and save the result as approx_pi_gcd.rs

Answer (Cursor)

Answer (Julia)

# Function to compute the greatest common divisor (GCD) using the Euclidean algorithm
function mygcd(a, b)
    # Loop until the remainder is zero
    while b != 0
        tmp = b    # Store the value of b temporarily
        b = a % b  # Update b to the remainder of a divided by b
        a = tmp    # Set a to the previous value of b
    end
    return a       # When b is zero, a is the GCD
end

# Function to approximate pi using probability that two numbers are coprime
function calcPi(N)
    cnt = 0    # Counter for coprime pairs
    # Loop through all pairs (a, b) where 1 <= a, b <= N
    for a in 1:N
        for b in 1:N
            # Check if a and b are coprime
            if mygcd(a, b) == 1
                cnt += 1  # Increment counter if coprime
            end
        end
    end
    # Probability that two numbers are coprime
    prob = cnt / (N * N)
    # Approximate pi using the formula: pi ≈ sqrt(6 / prob)
    return sqrt(6 / prob)
end

# Main function to run the pi approximation
function main()
    N = 10000            # Number limit for coprimality checking
    approx_pi = @time calcPi(N) # Approximate pi and time it
    println("N: $(N)")   # Output N
    println("pi: $(approx_pi)") # Output approximation of pi
end

main() # Call the main function

Result

$ julia approx_pi_gcd.jl
  1.865392 seconds
N: 10000
pi: 3.141534239016629

Answer (Rust)

// Function to compute the greatest common divisor (GCD) using the Euclidean algorithm
fn mygcd(a: u64, b: u64) -> u64 {
    let mut a = a;
    let mut b = b;
    // Loop until the remainder is zero
    while b != 0 {
        let tmp = b;    // Store the value of b temporarily
        b = a % b;      // Update b to the remainder of a divided by b
        a = tmp;        // Set a to the previous value of b
    }
    a                   // When b is zero, a is the GCD
}

// Function to approximate pi using probability that two numbers are coprime
fn calc_pi(n: u64) -> f64 {
    let mut cnt = 0u64; // Counter for coprime pairs
    // Loop through all pairs (a, b) where 1 <= a, b <= N
    for a in 1..=n {
        for b in 1..=n {
            // Check if a and b are coprime
            if mygcd(a, b) == 1 {
                cnt += 1;  // Increment counter if coprime
            }
        }
    }
    // Probability that two numbers are coprime
    let prob = cnt as f64 / (n as f64 * n as f64);
    // Approximate pi using the formula: pi ≈ sqrt(6 / prob)
    (6.0 / prob).sqrt()
}

// Main function to run the pi approximation
fn main() {
    let n = 10000u64;   // Number limit for coprimality checking
    let start = std::time::Instant::now();
    let pi = calc_pi(n); // Approximate pi
    let duration = start.elapsed();
    println!("calcPi: {:?}", duration);
    println!("N: {}", n);   // Output N
    println!("pi: {}", pi); // Output approximation of pi
}

Result

$ rustc -C opt-level=3 approx_pi_gcd.rs && ./approx_pi_gcd
calcPi: 1.754083333s
N: 10000
pi: 3.141534239016629

Answer (C)

#include <stdio.h>
#include <math.h>
#include <time.h>

// Function to compute the greatest common divisor (GCD) using the Euclidean algorithm
int mygcd(int a, int b) {
    // Loop until the remainder is zero
    while (b != 0) {
        int tmp = b;    // Store the value of b temporarily
        b = a % b;      // Update b to the remainder of a divided by b
        a = tmp;        // Set a to the previous value of b
    }
    return a;           // When b is zero, a is the GCD
}

// Function to approximate pi using probability that two numbers are coprime
double calcPi(int N) {
    int cnt = 0;    // Counter for coprime pairs
    // Loop through all pairs (a, b) where 1 <= a, b <= N
    for (int a = 1; a <= N; a++) {
        for (int b = 1; b <= N; b++) {
            // Check if a and b are coprime
            if (mygcd(a, b) == 1) {
                cnt += 1;  // Increment counter if coprime
            }
        }
    }
    // Probability that two numbers are coprime
    double prob = (double)cnt / (N * N);
    // Approximate pi using the formula: pi ≈ sqrt(6 / prob)
    return sqrt(6.0 / prob);
}

// Main function to run the pi approximation
int main() {
    int N = 10000;            // Number limit for coprimality checking
    clock_t start = clock();
    double pi = calcPi(N);    // Approximate pi
    clock_t end = clock();
    double cpu_time_used = ((double)(end - start)) / CLOCKS_PER_SEC;

    printf("calcPi: %f seconds\n", cpu_time_used);
    printf("N: %d\n", N);     // Output N
    printf("pi: %f\n", pi);   // Output approximation of pi

    return 0;
}

Result

$ gcc -O3 approx_pi_gcd.c && ./a.out
calcPi: 1.753796 seconds
N: 10000
pi: 3.141534

Julia can execute this much faster.

# Function to approximate pi using probability that two numbers are coprime
function calcPi(N)
    cnt = 0    # Counter for coprime pairs
    # Loop through all pairs (a, b) where 1 <= a, b <= N
    for a in 1:N
        for b in 1:N
            # Check if a and b are coprime
            if gcd(a, b) == 1
                cnt += 1  # Increment counter if coprime
            end
        end
    end
    # Probability that two numbers are coprime
    prob = cnt / (N * N)
    # Approximate pi using the formula: pi ≈ sqrt(6 / prob)
    return sqrt(6 / prob)
end

# Main function to run the pi approximation
function main()
    N = 10000            # Number limit for coprimality checking
    approx_pi = @time calcPi(N) # Approximate pi and time it
    println("N: $(N)")   # Output N
    println("pi: $(approx_pi)") # Output approximation of pi
end

main() # Call the main function

Result

julia approx_pi_gcd.jl
  1.454210 seconds
N: 10000
pi: 3.141534239016629

Julia got faster?!

Why did it go from 1.865392 seconds to 1.454210 seconds?

Because Julia’s gcd function uses a different algorithm than the Euclidean algorithm.

Exercise

Clone the Julia repository and have the coding agent explain how the gcd function works
Have it answer what algorithm is used
Have it port Julia’s gcd implementation to replace mygcd in the code we ported from JavaScript to Rust

Hint

git clone --depth 1 https://github.com/JuliaLang/julia.git enables fast cloning (shallow clone). Use the following prompt:

Read the Julia code and answer how the gcd function is defined. Read @julia

Port these gcd functions to Rust

Answer

Indeed, the code is as follows:

# binary GCD (aka Stein's) algorithm
# about 1.7x (2.1x) faster for random Int64s (Int128s)
# Unfortunately, we need to manually annotate this as `@assume_effects :terminates_locally` to work around #41694.
# Since this is used in the Rational constructor, constant folding is something we do care about here.
@assume_effects :terminates_locally function _gcd(ain::T, bin::T) where T<:BitInteger
    zb = trailing_zeros(bin)
    za = trailing_zeros(ain)
    a = abs(ain)
    b = abs(bin >> zb)
    k = min(za, zb)
    while a != 0
        a >>= za
        absd, diff = absdiff(a, b)
        za = trailing_zeros(diff)
        b = min(a, b)
        a = absd
    end
    r = b << k
    return r % T
end

Answer (Cursor with Auto-mode)

Answer (Rust implementation)

// Port of Julia's gcd functions from base/intfuncs.jl
// Greatest common divisor implementations using Euclidean and Binary GCD algorithms

use std::ops::{Rem, Shr, Shl, Sub, BitAnd};

/// Greatest common (positive) divisor (or zero if all arguments are zero).
///
/// This is the general implementation for all integer types using the Euclidean algorithm.
pub fn gcd<T>(mut a: T, mut b: T) -> T
where
    T: Copy + PartialEq + Rem<Output = T> + From<u8> + Default,
{
    let zero = T::from(0u8);
    while b != zero {
        let t = b;
        b = a % b;
        a = t;
    }
    checked_abs(a)
}

/// Greatest common divisor for fixed-width integer types using Binary GCD (Stein's) algorithm.
///
/// This is about 1.7x (2.1x) faster for random Int64s (Int128s) compared to Euclidean algorithm.
pub fn gcd_bitinteger<T>(a: T, b: T) -> T
where
    T: Copy
        + PartialEq
        + PartialOrd
        + Shr<usize, Output = T>
        + Shl<usize, Output = T>
        + BitAnd<Output = T>
        + Sub<Output = T>
        + From<u8>
        + Default,
{
    let zero = T::from(0u8);

    if a == zero {
        return checked_abs(b);
    }
    if b == zero {
        return checked_abs(a);
    }

    // Handle typemin case for signed types
    // In Rust, we check for the minimum value differently
    // For now, we'll proceed with the algorithm

    _gcd(a, b)
}

/// Binary GCD (aka Stein's) algorithm implementation.
///
/// This is the core algorithm that's faster for fixed-width integers.
fn _gcd<T>(ain: T, bin: T) -> T
where
    T: Copy
        + PartialEq
        + PartialOrd
        + Shr<usize, Output = T>
        + Shl<usize, Output = T>
        + BitAnd<Output = T>
        + Sub<Output = T>
        + From<u8>
        + Default,
{
    let zero = T::from(0u8);

    let zb = trailing_zeros(bin);
    let mut za = trailing_zeros(ain);
    let mut a = abs(ain);
    let mut b = abs(bin >> zb);
    let k = if za < zb { za } else { zb };

    while a != zero {
        a = a >> za;
        let (absd, diff) = absdiff(a, b);
        za = trailing_zeros(diff);
        b = if a < b { a } else { b };
        a = absd;
    }

    let r = b << k;
    r
}

/// Compute absolute difference between two values.
/// Returns (absolute difference, signed difference).
fn absdiff<T>(x: T, y: T) -> (T, T)
where
    T: Copy + PartialOrd + Sub<Output = T> + From<u8>,
{
    if x >= y {
        let d = x - y;
        (d, d)
    } else {
        let d = y - x;
        (d, d)
    }
}

/// Compute the number of trailing zeros in the binary representation.
///
/// For unsigned types, this counts trailing zeros.
/// For signed types, we convert to unsigned first.
fn trailing_zeros<T>(x: T) -> usize
where
    T: Copy,
{
    // This is a simplified version. In practice, you'd use the built-in
    // trailing_zeros method for each integer type.
    // For a generic implementation, we'd need trait bounds that include
    // the actual integer traits.

    // For now, we'll use a basic implementation
    // In real code, you'd specialize for each integer type
    if let Some(val) = try_to_u64(x) {
        val.trailing_zeros() as usize
    } else {
        // Fallback: count manually
        count_trailing_zeros_fallback(x)
    }
}

/// Try to convert a value to u64 for trailing_zeros calculation.
fn try_to_u64<T>(_x: T) -> Option<u64> {
    None // Generic implementation - would need specialization
}

/// Fallback trailing zeros counting.
fn count_trailing_zeros_fallback<T>(_x: T) -> usize {
    0 // Placeholder - would need proper implementation
}

/// Compute absolute value with overflow checking.
fn checked_abs<T>(x: T) -> T
where
    T: Copy,
{
    // For unsigned types, abs is a no-op
    // For signed types, we'd use checked_abs
    // This is a simplified version
    abs(x)
}

/// Compute absolute value.
fn abs<T>(x: T) -> T
where
    T: Copy,
{
    // For unsigned types, this is a no-op
    // For signed types, we'd need to handle negation
    // This is a simplified generic version
    x
}

// Specialized implementations for specific integer types

/// GCD for u8
pub fn gcd_u8(a: u8, b: u8) -> u8 {
    gcd_bitinteger_u8(a, b)
}

/// Binary GCD for u8 (Stein's algorithm matching Julia's implementation)
pub fn gcd_bitinteger_u8(ain: u8, bin: u8) -> u8 {
    if ain == 0 {
        return bin;
    }
    if bin == 0 {
        return ain;
    }

    let zb = bin.trailing_zeros() as usize;
    let mut za = ain.trailing_zeros() as usize;
    let mut a = ain;
    let mut b = bin >> zb;
    let k = za.min(zb);

    while a != 0 {
        a >>= za;
        let absd = if a >= b { a - b } else { b - a };
        let diff = absd;
        za = diff.trailing_zeros() as usize;
        b = a.min(b);
        a = absd;
    }

    b << k
}

/// GCD for u16
pub fn gcd_u16(a: u16, b: u16) -> u16 {
    gcd_bitinteger_u16(a, b)
}

/// Binary GCD for u16 (Stein's algorithm matching Julia's implementation)
pub fn gcd_bitinteger_u16(ain: u16, bin: u16) -> u16 {
    if ain == 0 {
        return bin;
    }
    if bin == 0 {
        return ain;
    }

    let zb = bin.trailing_zeros() as usize;
    let mut za = ain.trailing_zeros() as usize;
    let mut a = ain;
    let mut b = bin >> zb;
    let k = za.min(zb);

    while a != 0 {
        a >>= za;
        let absd = if a >= b { a - b } else { b - a };
        let diff = absd;
        za = diff.trailing_zeros() as usize;
        b = a.min(b);
        a = absd;
    }

    b << k
}

/// GCD for u32
pub fn gcd_u32(a: u32, b: u32) -> u32 {
    gcd_bitinteger_u32(a, b)
}

/// Binary GCD for u32 (Stein's algorithm matching Julia's implementation)
pub fn gcd_bitinteger_u32(ain: u32, bin: u32) -> u32 {
    if ain == 0 {
        return bin;
    }
    if bin == 0 {
        return ain;
    }

    let zb = bin.trailing_zeros() as usize;
    let mut za = ain.trailing_zeros() as usize;
    let mut a = ain;
    let mut b = bin >> zb;
    let k = za.min(zb);

    while a != 0 {
        a >>= za;
        let absd = if a >= b { a - b } else { b - a };
        let diff = absd;
        za = diff.trailing_zeros() as usize;
        b = a.min(b);
        a = absd;
    }

    b << k
}

/// GCD for u64
pub fn gcd_u64(a: u64, b: u64) -> u64 {
    gcd_bitinteger_u64(a, b)
}

/// Binary GCD for u64 (Stein's algorithm matching Julia's implementation)
pub fn gcd_bitinteger_u64(ain: u64, bin: u64) -> u64 {
    if ain == 0 {
        return bin;
    }
    if bin == 0 {
        return ain;
    }

    let zb = bin.trailing_zeros() as usize;
    let mut za = ain.trailing_zeros() as usize;
    let mut a = ain;
    let mut b = bin >> zb;
    let k = za.min(zb);

    while a != 0 {
        a >>= za;
        let absd = if a >= b { a - b } else { b - a };
        let diff = absd;
        za = diff.trailing_zeros() as usize;
        b = a.min(b);
        a = absd;
    }

    b << k
}

/// GCD for u128
pub fn gcd_u128(a: u128, b: u128) -> u128 {
    gcd_bitinteger_u128(a, b)
}

/// Binary GCD for u128 (Stein's algorithm matching Julia's implementation)
pub fn gcd_bitinteger_u128(ain: u128, bin: u128) -> u128 {
    if ain == 0 {
        return bin;
    }
    if bin == 0 {
        return ain;
    }

    let zb = bin.trailing_zeros() as usize;
    let mut za = ain.trailing_zeros() as usize;
    let mut a = ain;
    let mut b = bin >> zb;
    let k = za.min(zb);

    while a != 0 {
        a >>= za;
        let absd = if a >= b { a - b } else { b - a };
        let diff = absd;
        za = diff.trailing_zeros() as usize;
        b = a.min(b);
        a = absd;
    }

    b << k
}

/// GCD for i8
pub fn gcd_i8(a: i8, b: i8) -> i8 {
    gcd_bitinteger_i8(a, b)
}

/// Binary GCD for i8 with signed integer handling (matching Julia's algorithm)
pub fn gcd_bitinteger_i8(ain: i8, bin: i8) -> i8 {
    if ain == 0 {
        return bin.abs();
    }
    if bin == 0 {
        return ain.abs();
    }

    // Handle typemin case
    if ain == i8::MIN && ain == bin {
        panic!("gcd({}, {}) overflows", ain, bin);
    }

    let zb = bin.unsigned_abs().trailing_zeros() as usize;
    let mut za = ain.unsigned_abs().trailing_zeros() as usize;
    let mut a = ain.unsigned_abs();
    let mut b = bin.unsigned_abs() >> zb;
    let k = za.min(zb);

    while a != 0 {
        a >>= za;
        let absd = if a >= b { a - b } else { b - a };
        let diff = absd;
        za = diff.trailing_zeros() as usize;
        b = a.min(b);
        a = absd;
    }

    (b << k) as i8
}

/// GCD for i16
pub fn gcd_i16(a: i16, b: i16) -> i16 {
    gcd_bitinteger_i16(a, b)
}

/// Binary GCD for i16 with signed integer handling (matching Julia's algorithm)
pub fn gcd_bitinteger_i16(ain: i16, bin: i16) -> i16 {
    if ain == 0 {
        return bin.abs();
    }
    if bin == 0 {
        return ain.abs();
    }

    if ain == i16::MIN && ain == bin {
        panic!("gcd({}, {}) overflows", ain, bin);
    }

    let zb = bin.unsigned_abs().trailing_zeros() as usize;
    let mut za = ain.unsigned_abs().trailing_zeros() as usize;
    let mut a = ain.unsigned_abs();
    let mut b = bin.unsigned_abs() >> zb;
    let k = za.min(zb);

    while a != 0 {
        a >>= za;
        let absd = if a >= b { a - b } else { b - a };
        let diff = absd;
        za = diff.trailing_zeros() as usize;
        b = a.min(b);
        a = absd;
    }

    (b << k) as i16
}

/// GCD for i32
pub fn gcd_i32(a: i32, b: i32) -> i32 {
    gcd_bitinteger_i32(a, b)
}

/// Binary GCD for i32 with signed integer handling (matching Julia's algorithm)
pub fn gcd_bitinteger_i32(ain: i32, bin: i32) -> i32 {
    if ain == 0 {
        return bin.abs();
    }
    if bin == 0 {
        return ain.abs();
    }

    if ain == i32::MIN && ain == bin {
        panic!("gcd({}, {}) overflows", ain, bin);
    }

    let zb = bin.unsigned_abs().trailing_zeros() as usize;
    let mut za = ain.unsigned_abs().trailing_zeros() as usize;
    let mut a = ain.unsigned_abs();
    let mut b = bin.unsigned_abs() >> zb;
    let k = za.min(zb);

    while a != 0 {
        a >>= za;
        let absd = if a >= b { a - b } else { b - a };
        let diff = absd;
        za = diff.trailing_zeros() as usize;
        b = a.min(b);
        a = absd;
    }

    (b << k) as i32
}

/// GCD for i64
pub fn gcd_i64(a: i64, b: i64) -> i64 {
    gcd_bitinteger_i64(a, b)
}

/// Binary GCD for i64 with signed integer handling (matching Julia's algorithm)
pub fn gcd_bitinteger_i64(ain: i64, bin: i64) -> i64 {
    if ain == 0 {
        return bin.abs();
    }
    if bin == 0 {
        return ain.abs();
    }

    if ain == i64::MIN && ain == bin {
        panic!("gcd({}, {}) overflows", ain, bin);
    }

    let zb = bin.unsigned_abs().trailing_zeros() as usize;
    let mut za = ain.unsigned_abs().trailing_zeros() as usize;
    let mut a = ain.unsigned_abs();
    let mut b = bin.unsigned_abs() >> zb;
    let k = za.min(zb);

    while a != 0 {
        a >>= za;
        let absd = if a >= b { a - b } else { b - a };
        let diff = absd;
        za = diff.trailing_zeros() as usize;
        b = a.min(b);
        a = absd;
    }

    (b << k) as i64
}

/// GCD for i128
pub fn gcd_i128(a: i128, b: i128) -> i128 {
    gcd_bitinteger_i128(a, b)
}

/// Binary GCD for i128 with signed integer handling (matching Julia's algorithm)
pub fn gcd_bitinteger_i128(ain: i128, bin: i128) -> i128 {
    if ain == 0 {
        return bin.abs();
    }
    if bin == 0 {
        return ain.abs();
    }

    if ain == i128::MIN && ain == bin {
        panic!("gcd({}, {}) overflows", ain, bin);
    }

    let zb = bin.unsigned_abs().trailing_zeros() as usize;
    let mut za = ain.unsigned_abs().trailing_zeros() as usize;
    let mut a = ain.unsigned_abs();
    let mut b = bin.unsigned_abs() >> zb;
    let k = za.min(zb);

    while a != 0 {
        a >>= za;
        let absd = if a >= b { a - b } else { b - a };
        let diff = absd;
        za = diff.trailing_zeros() as usize;
        b = a.min(b);
        a = absd;
    }

    (b << k) as i128
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_gcd_u64() {
        assert_eq!(gcd_u64(6, 9), 3);
        assert_eq!(gcd_u64(6, 0), 6);
        assert_eq!(gcd_u64(0, 0), 0);
        assert_eq!(gcd_u64(48, 18), 6);
        assert_eq!(gcd_u64(17, 13), 1);
    }

    #[test]
    fn test_gcd_i64() {
        assert_eq!(gcd_i64(6, 9), 3);
        assert_eq!(gcd_i64(6, -9), 3);
        assert_eq!(gcd_i64(-6, 9), 3);
        assert_eq!(gcd_i64(-6, -9), 3);
        assert_eq!(gcd_i64(6, 0), 6);
        assert_eq!(gcd_i64(0, 0), 0);
    }

    #[test]
    fn test_gcd_u32() {
        assert_eq!(gcd_u32(48, 18), 6);
        assert_eq!(gcd_u32(100, 25), 25);
    }

    #[test]
    fn test_gcd_i32() {
        assert_eq!(gcd_i32(48, -18), 6);
        assert_eq!(gcd_i32(-100, 25), 25);
    }
}

// Function to approximate pi using probability that two numbers are coprime
fn calc_pi(n: u64) -> f64 {
    let mut cnt = 0u64; // Counter for coprime pairs
    // Loop through all pairs (a, b) where 1 <= a, b <= N
    for a in 1..=n {
        for b in 1..=n {
            // Check if a and b are coprime using the fast binary GCD algorithm
            if gcd::gcd_u64(a, b) == 1 {
                cnt += 1;  // Increment counter if coprime
            }
        }
    }
    // Probability that two numbers are coprime
    let prob = cnt as f64 / (n as f64 * n as f64);
    // Approximate pi using the formula: pi ≈ sqrt(6 / prob)
    (6.0 / prob).sqrt()
}

// Main function to run the pi approximation
fn main() {
    let n = 10000u64;   // Number limit for coprimality checking
    let start = std::time::Instant::now();
    let pi = calc_pi(n); // Approximate pi
    let duration = start.elapsed();
    println!("calcPi: {:?}", duration);
    println!("N: {}", n);   // Output N
    println!("pi: {}", pi); // Output approximation of pi
}

Result

rustc -C opt-level=3 gcd.rs && ./gcd
calcPi: 1.2510425s
N: 10000
pi: 3.141534239016629

Advanced topics

Summary of measurement results so far

Algorithm	Language	Execution time
Euclidean algorithm	JavaScript	2.175 s
	Julia	1.865392 s
	Rust	1.754083 s
	C	1.753796 s
Binary GCD algorithm	Julia	1.454210 s
	Rust	1.251042 s

How can we achieve Rust-equivalent performance in Julia?

The Binary GCD Algorithm can be made even faster

Exercise

Further optimize the Binary GCD Algorithm (1.45 -> 1.07 seconds on Julia)

Hint

Implement the Binary GCD Algorithm in Rust and measure execution time
If the Rust implementation is faster than Julia, compare the Rust binary with the assembly output from Julia’s @code_native macro
Fewer instructions improve execution speed; feed the comparison results back into the Julia code

Answer

Opus 4.5 generated this!

Rust

Benchmark:
Run 1: 1.05132725s  pi=3.141534239016629
Run 2: 1.0358485s  pi=3.141534239016629
Run 3: 1.034275125s  pi=3.141534239016629

Julia

$ julia src/main.jl
Running tests...
All tests passed!

Benchmark:
Run 1: 1.072994208s  pi=3.141534239016629
Run 2: 1.07328325s  pi=3.141534239016629
Run 3: 1.072947583s  pi=3.141534239016629

Another example

What is the expected number of random numbers (generated uniformly) such that their sum exceeds one?

Refer this discussion on MathOverflow.

Exercise

Write a numerical computing program. Try it in various programming languages

Answer (Julia)

function count_upto_one()
    counter = 0
    accumulated = 0.0
    while true
        accumulated += rand()
        counter += 1
        if accumulated >= 1.0
            break
        end
    end
    return counter
end

function main()
    n_trial =1e8
    total_count=0
    for trial in 1:n_trial
        total_count+=count_upto_one()
    end
    println(total_count/n_trial)
end

main()

Appendix

Are software engineers unnecessary? (1)

AI adoption frees us from the following local concerns:
- Language learning cost
- Very fine-grained language features and syntax
- Coding style
We should focus on the following global discussions:
- How to design software with good user experience
- Which technologies and languages to choose for each situation
- How to maintain high-quality software
- What to implement or not
Bottleneck investigation with visual profilers is difficult to fully automate with AI. Such workflows still exist
- Optimization at millisecond, microsecond, nanosecond levels relies on human intuition. Demand remains

Are software engineers unnecessary? (2)

Gaining domain knowledge is important
- Physics knowledge sufficient to communicate with physicists
Rust code can run on the web via WebAssembly.
- Work may emerge for publishing and maintaining numerical computing demos on the web

Are software engineers unnecessary? (3)

Researchers will get busier.
- They can do what they couldn’t before but wanted to.
- Work will increase accordingly
Researchers won’t be idle.
- One researcher subscribed to Claude Code MAX $200 and got dizzy from full AI usage
- I recommended downgrading to the $100 plan. Sounds like a joke but it’s true
{SWE task} $\subset$ {Researcher's demand} $\setminus$ {Things researchers actually can do}

Are software engineers unnecessary? (4)

Jobs disappear when a certain passion disappears.
It’s important to never lose your curiosity.

Introduction to Coding Agents

Overview

Self-introduction

Actually

Recent work: Organizing SparseIR ecosystem

Recent work: Development with tensor4all members

Recent work: mVMC

Recent work: SubsetJuliaVM

Recent work: RustCall.jl

Recent work: better-pluto-client

AI Driven Development

In reality

Build reproducible software for us!

Humans and machines (1)

Humans and machines (2)

The rise of coding agents

Humans and machines (3)

Humans and machines (4)

Are humans unnecessary? (1)

Are humans unnecessary? (2)

Are software engineers unnecessary?

Demo

Which tool should you use? (1)

Which tool should you use? (2)

As for me

Hardware

Required CLI tools

Software Installation

Installation (Cursor)

Cursor UI

Math Quiz

Proof

JS Implementation (1)

JS Implementation (2)

Exercise

Hint

Answer (Cursor)

Answer (Julia)

Answer (Rust)

Answer (C)

Julia can execute this much faster.

Julia got faster?!

Exercise

Hint

Answer

Answer (Cursor with Auto-mode)

Answer (Rust implementation)

Advanced topics

The Binary GCD Algorithm can be made even faster

Exercise

Hint

Answer

Another example

Exercise

Answer (Julia)

Appendix

Are software engineers unnecessary? (1)

Are software engineers unnecessary? (2)

Are software engineers unnecessary? (3)

Are software engineers unnecessary? (4)

Recent work: Development with `tensor4all` members

Recent work: `mVMC`