Porting free: Linux-like memory statistics tool for MacOS
A free command for macOS. You can find the source code and installation instructions in the project repository on GitHub: free-mac.
Introduction and Motivation
As a graduate student, my work often involves running complex simulations and data analysis tasks. Typically, these tasks run on powerful university clusters or on CERN machines, which have robust hardware, particularly in terms of memory. However, when working locally on my poor MacBook, I ran into a problem: macOS lacks a direct equivalent of Linux’s memory reporting tools, which are integral to my workflow.
To address this gap, I set out to develop a free, Mac-compatible tool that replicates the memory statistics functionality of Linux. This blog post chronicles that journey, from the initial motivation to the technical details of the tool’s development.
Memory management in macOS is known for its efficiency, often attributed to the sharing of large blocks of read-only memory between applications. Linux, on the other hand, has a complex memory management system with many configurable settings, accessible via the /proc filesystem and adjustable using sysctl. Despite these differences, I needed a tool that could give me a clear and comprehensive view of memory usage on my Mac, similar to what I was accustomed to on Linux. The free -h command was my go-to tool for this purpose, and I wanted to replicate its functionality on macOS because I couldn’t find a suitable alternative.
The tool I developed is a command-line program that fetches and displays memory statistics in a human-readable format. It reports total, used, free, cached, application, and wired memory, along with swap usage. The program uses the Mach API and sysctl to gather these statistics, which are not as straightforward to access on macOS as they are on Linux. macOS also handles cached memory somewhat differently from Linux.
Technical details
The Mach API lets us interact with the low-level features of macOS, which differs from the Linux approach, where memory information is typically read from files in the /proc directory. In my program, I use the mach_host_self() function to get the Mach port for the host, and host_page_size() to determine the page size, which is essential for calculating memory usage.
I then fetch the total physical memory using host_info() with HOST_BASIC_INFO, and detailed memory statistics using host_statistics64() with HOST_VM_INFO64. These functions provide a wealth of information about the system’s memory, which I then use to calculate the different components of memory usage.
Now I have the building blocks to calculate most of the memory types. There is still one that I could not derive from this information: swap. I will explain how to get the swap usage later, when I talk about the implementation.
Implementation
First we need to include the necessary headers.
#include <stdio.h>
#include <mach/mach.h>
#include <sys/types.h>
#include <sys/sysctl.h>
The first header, stdio.h, is needed for the printf and snprintf functions. The second, mach/mach.h, is needed for the Mach API functions. The third, sys/types.h, is needed for the size_t type. The last, sys/sysctl.h, is needed for the sysctl function.
I wanted the same format as free -h because I’m so used to how it looks, so I needed a way to emulate it. I wrote a function called formatBytes that converts a number of bytes into a human-readable format with an appropriate suffix (KB, MB, GB, and so on). This function is crucial for presenting the data in a way that’s easy to understand at a glance.
void formatBytes(unsigned long long bytes, char *buffer, int bufferSize, int human) {
    // Raw mode: write the plain byte count
    if (human == 0) {
        snprintf(buffer, bufferSize, "%llu", bytes);
        return;
    }
    const char *suffixes[] = {"B", "KB", "MB", "GB", "TB"};
    const int lastSuffix = 4; // index of "TB", the largest unit available
    int suffixIndex = 0;
    double result = bytes;
    // Loop to determine the appropriate suffix and reduce the bytes accordingly,
    // stopping at the last suffix so the index never leaves the array
    while (result >= 1024 && suffixIndex < lastSuffix) {
        result /= 1024.0;
        suffixIndex++;
    }
    // Format the result with the determined suffix
    snprintf(buffer, bufferSize, "%.2f %s", result, suffixes[suffixIndex]);
}
The formatBytes function converts a byte count into a human-readable string. It uses the human flag to decide whether to format the data. If human is 0, it simply writes the raw byte count into the buffer. Otherwise, it selects the appropriate unit (B, KB, MB, GB, TB) by dividing the byte count by 1024 repeatedly until the value is small enough for the unit. This loop both finds the right unit and reduces the byte count to a human-friendly size. The result is then formatted into a string using snprintf, which takes the buffer size into account to prevent overflow.
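To make the unit selection concrete, here is a minimal sketch of the same loop as a standalone helper. The name format_size and its return value are assumptions made for illustration and are not part of the tool; the loop stops at the last suffix so the array index can never run past TB.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical standalone variant of formatBytes (illustration only).
   Writes e.g. "1.50 KB" into buffer and returns snprintf's result. */
int format_size(unsigned long long bytes, char *buffer, size_t buffer_size) {
    static const char *suffixes[] = {"B", "KB", "MB", "GB", "TB"};
    const int last = (int)(sizeof(suffixes) / sizeof(suffixes[0])) - 1;
    int index = 0;
    double value = (double)bytes;

    assert(buffer != NULL && buffer_size > 0);

    /* Divide by 1024 until the value fits the current unit,
       but never step past the largest suffix. */
    while (value >= 1024.0 && index < last) {
        value /= 1024.0;
        index++;
    }
    return snprintf(buffer, buffer_size, "%.2f %s", value, suffixes[index]);
}
```

With a 20-byte buffer, format_size(1536, buf, sizeof(buf)) writes 1.50 KB, and 16 GiB (17179869184 bytes) comes out as 16.00 GB.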
Now let’s get back to the swap problem. macOS does not provide a simple way to get the swap usage like Linux does (where it can be read from /proc/swaps), so I had to find another way. I figured out that this can be done by calling sysctl with CTL_VM and VM_SWAPUSAGE.
Now we are ready for a main function to rule them all. If I were a better programmer, I would have written a function for each of the memory types. But I’m not, so I wrote a single function that calculates all of them.
I start the main function by defining some structs and variables that will be used later.
int main() {
    // Initialize Mach port and page size variables
    mach_port_t host_port = mach_host_self();
    vm_size_t page_size;
    host_page_size(host_port, &page_size);
    // Fetch total physical memory using host basic info
    host_basic_info_data_t hostInfo;
    mach_msg_type_number_t info_count = HOST_BASIC_INFO_COUNT;
    if (host_info(host_port, HOST_BASIC_INFO, (host_info_t)&hostInfo, &info_count) != KERN_SUCCESS) {
        fprintf(stderr, "Failed to get total memory\n");
        return 1;
    }
    // Fetch detailed memory statistics using VM statistics
    vm_statistics64_data_t vm_stat;
    mach_msg_type_number_t host_size = sizeof(vm_statistics64_data_t) / sizeof(integer_t);
    if (host_statistics64(host_port, HOST_VM_INFO64, (host_info64_t)&vm_stat, &host_size) != KERN_SUCCESS) {
        fprintf(stderr, "Failed to get memory statistics\n");
        return 1;
    }
    // ... the body of main continues in the snippets that follow
I start by initializing a mach_port_t variable named host_port with the result of mach_host_self, which returns the Mach port for the host. Next, a vm_size_t variable named page_size is declared and filled with the page size in bytes by calling host_page_size with host_port and the address of page_size. A host_basic_info_data_t structure named hostInfo is then declared to hold basic information about the host, and a mach_msg_type_number_t variable, info_count, is initialized with HOST_BASIC_INFO_COUNT, which is the size of hostInfo in units of integer_t. The host_info function is called with these parameters to populate hostInfo. If the call does not succeed, meaning it returns a value other than KERN_SUCCESS, an error message is printed and 1 is returned to indicate an error.
After that, a vm_statistics64_data_t structure named vm_stat is declared to hold detailed virtual memory statistics. The mach_msg_type_number_t variable host_size is initialized with the size of vm_stat divided by the size of integer_t. The host_statistics64 function is called with these parameters to fill vm_stat with the host’s detailed virtual memory statistics. If the call does not succeed, meaning it returns a value other than KERN_SUCCESS, an error message is printed and 1 is returned to indicate that an error occurred.
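As a side note, the count computed from sizeof matches what the Mach headers already expose as HOST_VM_INFO64_COUNT. The sketch below uses that constant and also checks the return value of host_page_size. The helper name read_free_bytes is hypothetical, and on systems other than macOS it simply reports failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __APPLE__
#include <mach/mach.h>
#endif

/* Hypothetical helper (illustration only): stores the free memory in bytes.
   Returns 0 on success, -1 on failure or when not built on macOS. */
int read_free_bytes(uint64_t *free_bytes) {
    assert(free_bytes != NULL);
    *free_bytes = 0;
#ifdef __APPLE__
    mach_port_t host = mach_host_self();
    vm_size_t page_size = 0;
    vm_statistics64_data_t stats;
    /* Predefined constant instead of the manual sizeof division */
    mach_msg_type_number_t count = HOST_VM_INFO64_COUNT;

    if (host_page_size(host, &page_size) != KERN_SUCCESS) {
        return -1;
    }
    if (host_statistics64(host, HOST_VM_INFO64,
                          (host_info64_t)&stats, &count) != KERN_SUCCESS) {
        return -1;
    }
    /* Same formula as the tool: free pages minus speculative pages */
    *free_bytes = (uint64_t)(stats.free_count - stats.speculative_count) * page_size;
    return 0;
#else
    return -1; /* Mach API not available on this platform */
#endif
}
```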
Now it’s time to get the swap information from sysctl.
Here is the code that does that:
// Get swap information using sysctl
struct xsw_usage swapinfo;
size_t swapinfo_sz = sizeof(swapinfo);
int mib[2] = {CTL_VM, VM_SWAPUSAGE};
if (sysctl(mib, 2, &swapinfo, &swapinfo_sz, NULL, 0) != 0) {
    perror("sysctl");
    return 1;
}
This snippet begins by declaring a structure of type xsw_usage named swapinfo to hold the swap usage information. It also declares a size_t variable, swapinfo_sz, initialized with the size of swapinfo. An integer array mib of size 2 is initialized with the constants CTL_VM and VM_SWAPUSAGE, which specify the information to be retrieved. The sysctl function is then called with these parameters to fill swapinfo. In general, sysctl reads and/or writes kernel parameters; here it is used only to read the swap usage. If the call fails (i.e., it returns a value other than 0), the code prints an error message using perror and returns 1 to indicate that an error occurred.
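The same information can, as far as I know, also be requested by name rather than through a mib array. Here is a small sketch that assumes the vm.swapusage name (the one the sysctl command-line tool prints) and wraps the call in a hypothetical read_swap_usage helper. Outside macOS it just returns -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical helper (illustration only): reads swap totals in bytes.
   Returns 0 on success, -1 on failure or when not built on macOS. */
int read_swap_usage(uint64_t *total, uint64_t *used, uint64_t *avail) {
    assert(total != NULL && used != NULL && avail != NULL);
    *total = *used = *avail = 0;
#ifdef __APPLE__
    struct xsw_usage swapinfo;
    size_t size = sizeof(swapinfo);

    /* Look the value up by name instead of building a mib array */
    if (sysctlbyname("vm.swapusage", &swapinfo, &size, NULL, 0) != 0) {
        return -1;
    }
    *total = swapinfo.xsu_total;
    *used = swapinfo.xsu_used;
    *avail = swapinfo.xsu_avail;
    return 0;
#else
    return -1; /* no vm.swapusage on this platform */
#endif
}
```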
Now every piece of information is ready for us to begin our calculations. The first thing to do is to declare some variables for the different memory types.
// Declare variables for different memory types
unsigned long long total_memory = hostInfo.max_mem;
unsigned long long free_memory = (unsigned long long)(vm_stat.free_count - vm_stat.speculative_count) * page_size;
unsigned long long wired_memory = (unsigned long long)vm_stat.wire_count * page_size;
unsigned long long app_memory = (unsigned long long)(vm_stat.internal_page_count - vm_stat.purgeable_count) * page_size;
unsigned long long cached_memory = (unsigned long long)(vm_stat.purgeable_count + vm_stat.external_page_count) * page_size;
unsigned long long used_memory = total_memory - free_memory - cached_memory;
The total_memory variable is set to the maximum memory available on the host, taken from the max_mem field of the hostInfo structure. The free_memory variable is calculated by subtracting the speculative page count (speculative_count) from the free page count (free_count), both obtained from the vm_stat structure, and multiplying the result by the page size (page_size) to convert pages into bytes. The wired_memory variable is calculated by multiplying the wired page count (wire_count from vm_stat) by the page size. Wired memory is memory that can’t be paged out to disk. The app_memory variable is calculated by subtracting the purgeable page count (purgeable_count from vm_stat) from the internal page count (internal_page_count from vm_stat) and multiplying the result by the page size. The cached_memory variable is calculated by adding the purgeable page count and the external page count (external_page_count from vm_stat), and then multiplying the result by the page size. Finally, the used_memory variable is calculated by subtracting the free memory and the cached memory from the total memory. This gives the amount of memory that is currently in use.
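To check the arithmetic by hand, the formulas can be pulled into a small pure function. This is only a sketch: struct page_counts is a hypothetical stand-in for the vm_stat fields, summarize_memory is a made-up name, and the numbers in the note below are invented (a 16 KiB page size and 16 GiB of RAM are assumptions, not measurements).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the vm_stat fields used by the tool */
struct page_counts {
    uint32_t free_count;
    uint32_t speculative_count;
    uint32_t wire_count;
    uint32_t internal_page_count;
    uint32_t external_page_count;
    uint32_t purgeable_count;
};

/* All values in bytes */
struct memory_report {
    uint64_t free_bytes;
    uint64_t wired_bytes;
    uint64_t app_bytes;
    uint64_t cached_bytes;
    uint64_t used_bytes;
};

struct memory_report summarize_memory(const struct page_counts *pc,
                                      uint64_t page_size,
                                      uint64_t total_bytes) {
    struct memory_report r;
    assert(pc != NULL);

    /* Widen to 64 bits before the arithmetic so large counts cannot overflow */
    r.free_bytes = ((uint64_t)pc->free_count - pc->speculative_count) * page_size;
    r.wired_bytes = (uint64_t)pc->wire_count * page_size;
    r.app_bytes = ((uint64_t)pc->internal_page_count - pc->purgeable_count) * page_size;
    r.cached_bytes = ((uint64_t)pc->purgeable_count + pc->external_page_count) * page_size;
    /* Used is whatever is neither free nor cached */
    r.used_bytes = total_bytes - r.free_bytes - r.cached_bytes;
    return r;
}
```

For example, with a page size of 16384 bytes, 17179869184 bytes of total memory, and page counts of 4000 free, 1000 speculative, 128000 wired, 400000 internal, 200000 external, and 20000 purgeable, this gives 49152000 bytes free, 3604480000 bytes cached, and 13526237184 bytes used.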
All that is left is some formatting and a few print statements to show the output.
// Formatting memory sizes for human-readable output
char totalStr[20], usedStr[20], freeStr[20], cachedStr[20], appStr[20], wiredStr[20];
char swapTotalStr[20], swapUsedStr[20], swapFreeStr[20];
// Convert memory and swap statistics into human-readable format
formatBytes(total_memory, totalStr, sizeof(totalStr), 1);
formatBytes(used_memory, usedStr, sizeof(usedStr), 1);
formatBytes(free_memory, freeStr, sizeof(freeStr), 1);
formatBytes(cached_memory, cachedStr, sizeof(cachedStr), 1);
formatBytes(app_memory, appStr, sizeof(appStr), 1);
formatBytes(wired_memory, wiredStr, sizeof(wiredStr), 1);
formatBytes(swapinfo.xsu_total, swapTotalStr, sizeof(swapTotalStr), 1);
formatBytes(swapinfo.xsu_used, swapUsedStr, sizeof(swapUsedStr), 1);
formatBytes(swapinfo.xsu_avail, swapFreeStr, sizeof(swapFreeStr), 1);
// Printing formatted results
printf("%20s %14s %14s %14s %14s %14s\n", "total", "used", "free", "cached", "app", "wired");
printf("Mem: %15s %14s %14s %14s %14s %14s\n", totalStr, usedStr, freeStr, cachedStr, appStr, wiredStr);
printf("Swap: %14s %14s %14s\n", swapTotalStr, swapUsedStr, swapFreeStr);
return 0;
}
This code formats the previously calculated memory statistics and prints them. First, it declares several character arrays (totalStr, usedStr, freeStr, cachedStr, appStr, wiredStr, swapTotalStr, swapUsedStr, swapFreeStr) to hold the formatted memory sizes. Then it calls formatBytes for each memory statistic. As shown earlier, this function converts a size in bytes into a human-readable format (KB, MB, GB, etc.) and stores the result in the corresponding character array. The sizeof operator is used to pass the size of each array to formatBytes. Finally, the code prints the formatted statistics using printf. The %20s, %14s, etc. are format specifiers that print a string right-aligned in a field of the given minimum width. The printf function is called three times to print the headers (“total”, “used”, “free”, etc.), the memory statistics, and the swap statistics.
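A quick way to see how the width works: %14s pads on the left with spaces until the field is 14 characters wide, and a longer string is printed in full rather than cut off. The helper name format_column below is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (illustration only): right-align text in a
   14-character column. Returns the length the full result needs,
   exactly as snprintf reports it. */
int format_column(char *out, size_t out_size, const char *text) {
    assert(out != NULL && text != NULL && out_size > 0);
    /* The width is a minimum: short strings are padded, long ones kept whole */
    return snprintf(out, out_size, "%14s", text);
}
```

Formatting the word used this way yields ten spaces followed by used, 14 characters in total, which is why the header labels line up with the right edge of the values beneath them.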
Usage
Our program is ready to be used. To compile it, we can use the following command:
gcc -o memory_info memory_info.c
To run it, we can use the following command:
./memory_info
The output should look like this:
./memory_info
               total           used           free         cached            app          wired
Mem:        16.00 GB       12.63 GB       46.88 MB        3.33 GB        5.81 GB        1.95 GB
Swap:         0.00 B         0.00 B         0.00 B
Developing this tool, which I did because I wanted to be more aware of memory usage on my poor Mac, was a fun experience. I learned a lot about how memory is managed on macOS and how to use the Mach API (for the first time). This is a simple tool that almost anyone could write in a better way, and it will probably never be used by anyone. To be honest, I always advise people not to run any of the code I write, let alone rely on it. But I did it anyway and wanted to write about it.