Woo hoo! The portable hardware locality project (hwloc) has finally hit release candidate status. Much has changed since the v0.9 series, all of it for the better. There’s an impressive array of features and other goodness contained in the upcoming v1.0 release (if I do say so myself — although the INRIA guys did most of the heavy lifting). Check out the release announcement, or read below the jump for an abbreviated list of the new stuff.
I don’t normally make hoopla over release candidates, but we’d actually like to get people to give this stuff a whirl before it hits v1.0 so that we can iron out any kinks.
And if you’re wondering why a high-performance networking blog cares about a server-side software project that appears to have nothing to do with networking, read some of my prior posts. Short version: this stuff already matters somewhat for networking performance. It’s going to matter (much) more as time goes on.
Here’s a partial list of changes since the v0.9 series (which may change before v1.0 is final):
- The ABI of the library has changed.
- Backend updates:
  - Add FreeBSD support.
  - Add x86 cpuid based backend.
  - Add Linux cgroup support to the Linux cpuset code.
  - Support binding of entire multithreaded process on Linux.
  - Clean up XML export/import.
- HWLOC_OBJ_PROC is renamed to HWLOC_OBJ_PU (for “Processing Unit”); its stringified type name is now “PU”.
- Use new HWLOC_OBJ_GROUP objects instead of MISC when grouping objects according to NUMA distances or arbitrary OS aggregation.
- Rework memory attributes.
- Add different cpusets in each object to specify processors that are offline, unavailable, …
- Clean up the storage of object names and DMI info.
- Add support for looking up specific PID topology information.
- Add hwloc_topology_export_xml() to export the topology to an XML file.
- Add hwloc_topology_get_support() to retrieve the supported features for the current topology context.
- Support non-SYSTEM objects as the root of the tree; use MACHINE in most common cases.
- Add hwloc_get_*cpubind() routines to retrieve the current binding of processes and threads.
- Add HWLOC_API_VERSION to help detect the currently used API version.
- Add missing ending “e” to *compare* functions.
- Add several routines to emulate PLPA functions.
- Rename and rework the cpuset and/or/xor/not/clear operators to output their result in a dedicated argument instead of modifying one input.
- Deprecate hwloc_obj_snprintf() in favor of hwloc_obj_type/attr_snprintf().
- Clarify the use of “parent” and “ancestor” in the API; do not use “father”.
- Replace hwloc_get_system_obj() with hwloc_get_root_obj().
- Return -1 instead of HWLOC_OBJ_TYPE_MAX in the API since the latter isn’t public.
- Relax constraints in hwloc_obj_type_of_string().
- Improve the display of memory sizes.
- Add 0x prefix to cpuset strings.
- lstopo now displays logical indexes by default; use --physical to revert to OS/physical indexes.
- Add colors in the lstopo graphical outputs to distinguish between online, offline, reserved, … objects.
- Extend lstopo to show cpusets, filter objects by type, …
- Rename hwloc-mask to hwloc-calc, which supports many new options.
- Add a hwloc(7) manpage containing general information.
- Add documentation about how to switch from PLPA to hwloc.
- Clean up the distributed documentation files.
- Many compiler warning fixes.
- Clean up the ABI by using the visibility attribute.
- Add project embedding support.
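
One item in the list above deserves a quick illustration: the cpuset and/or/xor/not/clear operators now write their result to a dedicated argument instead of modifying one of their inputs. The sketch below is a toy, not hwloc code. The `toy_cpuset` type and every `toy_cpuset_*` function are hypothetical names invented for this example, and the 32-bit mask and the exact `0x`-prefixed string layout are assumptions. hwloc's real cpuset type is opaque and its function names differ, so check the hwloc headers for the actual API.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a cpuset: one bit per processing unit (PU).
 * This is NOT hwloc's type; it is limited to 32 PUs for brevity. */
typedef struct { uint32_t bits; } toy_cpuset;

/* Old calling style (illustrative): the first operand is overwritten,
 * so the caller loses its original value. */
void toy_cpuset_and_inplace(toy_cpuset *set, const toy_cpuset *other)
{
    set->bits &= other->bits;
}

/* New calling style (illustrative): the result goes to a dedicated
 * output argument and both inputs are left untouched. */
void toy_cpuset_and(toy_cpuset *res, const toy_cpuset *a, const toy_cpuset *b)
{
    res->bits = a->bits & b->bits;
}

void toy_cpuset_or(toy_cpuset *res, const toy_cpuset *a, const toy_cpuset *b)
{
    res->bits = a->bits | b->bits;
}

/* Print the set as a 0x-prefixed hex string (format is an assumption). */
int toy_cpuset_snprintf(char *buf, size_t len, const toy_cpuset *set)
{
    return snprintf(buf, len, "0x%08" PRIx32, set->bits);
}

/* Small self-check: returns 0 if everything behaves as described. */
int toy_cpuset_demo(void)
{
    toy_cpuset socket0 = { 0x0f };  /* PUs 0-3 */
    toy_cpuset allowed = { 0x3c };  /* PUs 2-5 */
    toy_cpuset res = { 0 };
    char buf[16];

    toy_cpuset_and(&res, &socket0, &allowed);

    /* The intersection is PUs 2-3, and neither input changed. */
    assert(res.bits == 0x0c);
    assert(socket0.bits == 0x0f && allowed.bits == 0x3c);

    toy_cpuset_snprintf(buf, sizeof(buf), &res);
    return strcmp(buf, "0x0000000c") == 0 ? 0 : -1;
}
```

Calling `toy_cpuset_demo()` from a `main()` exercises the sketch. The point of the dedicated output argument is that a caller can intersect the same set against several candidates without having to copy it first.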