I was trying to shave compute units off my program when I realized I had no idea what happens at the entrypoint.
Every Solana program starts somewhere. That somewhere is a function with a specific name and signature. Most developers use the entrypoint! macro, watch it work, and never think about it again.
But when you’re optimizing, you need to understand the boundary. What does entrypoint! actually do? How does Rust code become callable by a runtime that only speaks the C ABI?
Strip away the macro, and you’re left with a C-ABI function that takes a pointer and returns a u64. The runtime passes a pointer to a flat memory region containing account data, instruction data, and the program ID. Your code unpacks that pointer into safe Rust types, runs your logic, and returns a status code.
That’s the interface. Everything else—AccountInfo, Pubkey, ProgramResult—is built on top of this foundation.
The C ABI Boundary
The Solana runtime is written in Rust, but it treats your program as a black box. It doesn’t know about your types, your trait implementations, or your lifetime annotations. It knows one thing: there’s a function called entrypoint that follows the C calling convention.
extern "C" tells the Rust compiler to expose a symbol using the C calling convention. This is the ABI that defines which eBPF registers carry arguments, how the stack is structured, and which register returns a value. The runtime calls your program’s entrypoint by raw symbol name and passes a pointer in a specific register.
Sticking to the C ABI guarantees this works whether you compile from Rust, C, Zig, or hand-written assembly. Every LLVM-based language targeting eBPF follows the same convention.
Here’s what the actual entrypoint looks like without the macro:
#[no_mangle]
pub unsafe extern "C" fn entrypoint(input: *mut u8) -> u64 {
    // your code here
}
One pointer in. One status code out. That’s the contract.
The Flat Memory Layout
The pointer the runtime passes points to a flat record stored on the BPF VM’s input page. This record contains everything an instruction needs: the number of accounts, the raw account data laid out sequentially, the instruction payload length, the instruction data itself, and the program ID.
Each account in that layout follows this structure:
#[repr(C)]
pub struct AccountRaw {
    is_duplicate: u8,
    is_signer: u8,
    is_writable: u8,
    executable: u8,
    alignment: u32,
    key: [u8; 32],
    owner: [u8; 32],
    lamports: u64,
    data_len: usize,
    data: [u8; data_len], // variable length (illustrative, not valid Rust)
    padding: [u8; 10_240],
    rent_epoch: u64,
}
The 10,240 bytes of padding exist so account data can grow in place without moving memory. The alignment field keeps the struct 8-byte aligned. #[repr(C)] forces Rust to arrange these fields exactly as a C compiler would, locking field order and padding so the byte layout never changes.
Every abstraction you use—AccountInfo, Signer<'info>, the entire Anchor account model—is built by parsing this flat structure.
What entrypoint! Actually Does
The entrypoint! macro does three things:
First, it unpacks the raw pointer into safe Rust types. It walks the flat memory layout, calculates offsets, and constructs &Pubkey, &[AccountInfo], and &[u8] references pointing to the correct regions of the input buffer.
Second, it handles panics. There is no unwinding on the BPF target, so the macro wires in a default panic handler: if your code panics, the handler logs the panic message and the instruction aborts with an error instead of leaving the VM in an undefined state.
Third, it creates that extern "C" wrapper around your process_instruction function so the runtime can find and call it.
Here’s what you write:
entrypoint!(process_instruction);

fn process_instruction(
    program_id: &Pubkey,
    accounts: &[AccountInfo],
    instruction_data: &[u8],
) -> ProgramResult {
    // your logic
}
Here’s what the macro generates:
#[no_mangle]
pub unsafe extern "C" fn entrypoint(input: *mut u8) -> u64 {
    // Read account_len from the first 8 bytes
    let account_len = *(input as *const u64);

    // Walk the input buffer, construct the AccountInfo slice
    let accounts = /* pointer math to build &[AccountInfo] */;

    // Find the instruction_data offset, construct the slice
    let instruction_data = /* pointer math to build &[u8] */;

    // Read program_id from its offset
    let program_id = /* pointer math to build &Pubkey */;

    // Call your function and map the result to a status code.
    // Panics never reach this match: with no unwinding on BPF,
    // they hit the installed panic handler, which logs and aborts.
    match process_instruction(program_id, accounts, instruction_data) {
        Ok(()) => 0,
        Err(e) => e.into(),
    }
}
The macro is just a parser. It knows the memory layout, calculates the offsets, and hands you safe references.
Syscalls
BPF bytecode cannot hash, log, or invoke other programs on its own. For privileged operations, programs call named syscalls: sol_log, sol_sha256, sol_invoke_signed.
Each syscall is declared as an extern "C" function. The program loads arguments into eBPF registers r1 through r5—up to five parameters depending on the syscall—then executes a call instruction with the syscall’s identifier. The runtime executes the requested operation in privileged mode and returns a status code in r0—zero for success.
Here’s the signature for cross-program invocation:
extern "C" {
    fn sol_invoke_signed_c(
        instruction_addr: *const u8,   // r1
        account_infos_addr: *const u8, // r2
        account_infos_len: u64,        // r3
        signers_seeds_addr: *const u8, // r4
        signers_seeds_len: u64,        // r5
    ) -> u64;                          // r0
}
Five registers for arguments. One register for the return value. The C ABI maps function parameters directly to eBPF register assignments.
The runtime expects the pointers to reference C-compatible structs with #[repr(C)] layout:
#[repr(C)]
struct SolInstruction {
    program_id_addr: u64,
    accounts_addr: u64,
    accounts_len: usize,
    data_addr: u64,
    data_len: usize,
}

#[repr(C)]
struct SolAccountMeta {
    pubkey_addr: u64,
    is_writable: bool,
    is_signer: bool,
}
When you call invoke_signed from solana_program, it’s building these structs, casting them to raw pointers, loading them into registers, and calling sol_invoke_signed_c. The nice API is a wrapper around register manipulation and pointer casts.
In Practice
To prove these concepts work, I built a crateless vault using only pointer arithmetic and syscalls. Here’s what account validation looks like without the macro.
The first account must be a signer, writable, non-duplicate, and non-executable. Those four flags sit in adjacent bytes at the start of the account record. Instead of checking each flag individually, a single u32 load validates all four:
const ACCOUNT_OFFSET: usize = 8; // First account starts after account_len

// Check is_duplicate (0xff), is_signer (0x01), is_writable (0x01),
// executable (0x00), packed into a u32: 0x0101ff (little-endian)
if *(input.add(ACCOUNT_OFFSET) as *const u32) != 0x0101ff {
    return 1003; // InvalidAccountData
}

// Read the signer's key for later use
let signer_key = *(input.add(ACCOUNT_OFFSET + 8) as *const [u8; 32]);
One load. Four validations. This is the optimization the flat layout enables.
For the second account, the vault needs to verify it’s a PDA derived from the signer’s key. This requires calling sol_sha256:
extern "C" {
    fn sol_sha256(vals: *const u8, val_len: u64, hash_result: *mut [u8; 32]) -> u64;
}

// Build the input for PDA derivation
let bump: u8 = instruction_data[0];
let data = [
    signer_key.as_ref(),
    &[bump],
    PROGRAM_ID.as_ref(),
    b"ProgramDerivedAddress",
];

// Hash the inputs
let mut pda = core::mem::MaybeUninit::<[u8; 32]>::uninit();
sol_sha256(&data as *const _ as *const u8, 4, pda.as_mut_ptr());

// Verify it matches the vault key
let vault_key = *(input.add(ACCOUNT_2_OFFSET + 8) as *const [u8; 32]);
if vault_key != *pda.as_ptr() {
    return 1013; // InvalidSeeds
}
The syscall takes an array of slices, a count, and a pointer to write the result. Raw pointers in, raw result out. The solana_program version wraps this in safe types, but the underlying operation is identical.
What This Means
The entrypoint is an extern "C" function that takes a pointer and returns a u64. The pointer references a flat memory layout containing account metadata, instruction data, and the program ID. The entrypoint! macro parses that layout using known byte offsets and constructs safe Rust references.
Every framework—solana_program, Anchor, Pinocchio—parses the same flat structure. AccountInfo wraps pointers to specific offsets in the input buffer. Signer<'info> adds compile-time checks on top of those same pointers. The zero-copy accessors in Pinocchio eliminate intermediate copies but still read from the same memory region.
Syscalls follow the same pattern. The high-level API builds #[repr(C)] structs, casts them to pointers, loads them into registers r1-r5, and executes a call instruction. The return value lands in r0.
This matters for optimization. Reading account flags is cheap because it’s a direct load from a known offset. Resizing account data is expensive because it requires reallocating the flat buffer and updating all downstream pointers. Cross-program invocations cost what they cost because building those #[repr(C)] structs and validating them isn’t free.
The abstractions exist for safety and correctness. But knowing what sits underneath—pointer arithmetic, register conventions, and a memory layout—makes the performance characteristics predictable.