**Encrypting** Wait while your phone is being encrypted.

Time remaining 00:00

After forcibly rebooting the phone, any password (even the wrong one) would result in a message like:

**Decryption unsuccessful** The password you entered is correct, but unfortunately your data is corrupt. To resume using your phone, you need to perform a factory reset. When you set up your phone after the reset, you’ll have an opportunity to restore any data that was backed up to your Google Account.

This same message appeared even after the suggested factory reset – from this state, the only way I could find to restore the phone to working order was to format the data partition.

I looked into the source code, and it turns out that this problem happens if the encryption finishes successfully but the final writing of encryption metadata to the metadata partition (or the final test mount) fails for some reason. Normally the phone would restart soon after completion and the time remaining screen would disappear, but in this case nothing tells the user interface to report that an error occurred so it just sits there.

Debugging this sort of problem is not too difficult if you can plug in the phone via USB and view live logs with `adb logcat`. The good news is that the log output from the encryption process is quite verbose. The bad news is that since the encryption process is done as part of a restart – with only a minimal system running and without user configuration loaded – `adbd` may not be running at this point to allow a connection. To allow debugging this, I built my own Android image with adb enabled at boot by making these changes. (Building Android from source is daunting but easier than you would think… mainly it just uses a lot of Internet bandwidth and disk space.)

In my case – unofficial LineageOS 16.0 on my osprey device – the root problem turned out to be incorrect SELinux rules that denied `vold` (the volume manager daemon) permission to the metadata partition:

```
D vold    : cryptfs_enable_inplace_f2fs success
E Cryptfs : Cannot open footer file /dev/block/bootdevice/by-name/metadata for put
W Binder:256_2: type=1400 audit(0.0:19): avc: denied { read write } for uid=0 name="mmcblk0p26" dev="tmpfs" ino=9642 scontext=u:r:vold:s0 tcontext=u:object_r:block_device:s0 tclass=blk_file permissive=0
I Cryptfs : cryptfs_check_passwd
E Cryptfs : Cannot open footer file /dev/block/bootdevice/by-name/metadata for get
E Cryptfs : Error getting crypt footer and key
E Cryptfs : Could not get footer
E Cryptfs : Encrypted filesystem not validated, aborting
```

To work around this I added a hack to `system/sepolicy/vold.te` in my Android build tree to allow `vold` access to all block devices… I’m sure there is a better way, but this helped me get back up and running quickly.

```
allow vold block_device:blk_file { create setattr unlink rw_file_perms };
```

I hope this helps someone else if they run into this same frustrating problem!

While WebEx triggers the bug most frequently, it is not a bug in WebEx: it has also been reported with BlueJeans, GoToMeeting and other conferencing solutions that use WebRTC technology. It is a bug in the PulseAudio backend in Chrome 79 and Chrome 80. I spent some time tracking this down, although someone else beat me to fixing it (no complaints). Here is the bug report and the fix.

The good news is that it’s fixed in Chrome 81.0.4044.62 and later, so the solution is to upgrade Chrome/Chromium. I am making this blog post as a note for anyone else who is looking for the solution to this problem.

**Technical detail:** Every 1.5 s, WebEx refreshes the microphone/speaker/camera list by calling the JavaScript function `navigator.mediaDevices.enumerateDevices()`. When using PulseAudio, Chromium’s underlying implementation of that function calls, amongst other functions, `AudioManagerPulse::GetInputStreamParameters()`, which is where the bug is. This function fails to take the PulseAudio mainloop lock before calling into `pa_context_get_source_info_by_name()` in the PulseAudio library. The `pa_context_get_source_info_by_name()` function sends a request to the PulseAudio server and enqueues a dispatch entry so that the library knows what to do when the response arrives. Because of the missing lock, there is a race condition: if another Chrome thread is executing in the PulseAudio library at the same time, the library’s dispatch queues can become corrupted.
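The eventual fix simply wraps these calls in the mainloop lock. To illustrate the discipline involved, here is a minimal Python sketch of guarding a shared dispatch queue against concurrent callers — the names are illustrative stand-ins, not Chromium’s or libpulse’s actual code:

```python
import threading
from collections import deque

dispatch_queue = deque()          # stand-in for libpulse's internal dispatch list
mainloop_lock = threading.Lock()  # stand-in for the PulseAudio mainloop lock

def enqueue_request(reply_handler):
    # Analogous to pa_context_get_source_info_by_name(): registers a handler
    # for a future reply. Must run with the mainloop lock held, otherwise
    # concurrent callers can corrupt the queue.
    with mainloop_lock:
        dispatch_queue.append(reply_handler)

def drain_replies():
    # The mainloop thread pops handlers as replies arrive.
    with mainloop_lock:
        handled = len(dispatch_queue)
        dispatch_queue.clear()
    return handled

workers = [threading.Thread(target=lambda: [enqueue_request(object()) for _ in range(1000)])
           for _ in range(8)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(drain_replies())  # 8000: no requests lost or double-counted
```

The Chromium fix is exactly this pattern applied at the two unguarded call sites.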

The magnetization M is 0 outside of a material. So if we assume that H is due to free currents only, one would conclude that the magnetic field/flux B outside a material is also a function of free currents only – e.g. the external magnetic field of a solenoid would be independent of the material the core is made of. However, as seen in my previous post, this is not true. Also, one might conclude that permanent magnets have no external magnetic field, which is clearly not true either…!

Actually, while ∇.B = 0 everywhere (there are no magnetic monopoles), ∇.H ≠ 0 in general. Since B = μ_{0}(H + M), taking the divergence shows that the H field has sources wherever ∇.M ≠ 0:

∇.H = −∇.M (1)

In other words if we want to calculate the magnitude of the H field throughout space, not only do we need to consider free currents (which produce circulation in the H field), but also material boundaries and other areas where M is non-uniform (producing sources in the H field).

Nonetheless, in typical elementary problems where we integrate around a closed loop, only the curl is important and we can still write for example ∮H.dl = I_{piercing} (plus the dE/dt term if relevant).
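Equation (1) is easy to sanity-check numerically in one dimension. With M = M(x) x̂ and no free currents, ∇.B = 0 forces B_x to be constant, so H_x = B_x/μ_0 − M_x and dH_x/dx = −dM_x/dx exactly. A quick numpy sketch (the slab magnetization value is arbitrary):

```python
import numpy as np

mu0 = 4e-7 * np.pi
x = np.linspace(-2, 2, 4001)             # metres
M = np.where(np.abs(x) < 1.0, 1e5, 0.0)  # uniformly magnetized slab, A/m
B = np.zeros_like(x)                     # div B = 0 in 1D => Bx constant (take 0)
H = B / mu0 - M                          # from B = mu0 (H + M)

dH = np.gradient(H, x)
dM = np.gradient(M, x)
print(np.allclose(dH, -dM))  # True: H has sources exactly where M is non-uniform
```

The sources of H are concentrated at the slab faces (x = ±1), i.e. at the material boundaries, exactly as described above.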

Props to Griffiths’ Introduction to Electrodynamics, section 6.3.2, for helping to clear up this confusion in my mind.

It is well known that the magnetic field will be significantly stronger inside such a solenoid, compared to an air core solenoid. The ferromagnetic core becomes temporarily magnetized and reinforces the magnetic field. But what I’m more interested in is: what is the effect of the ferromagnetic core on the field outside the solenoid, some arbitrary distance away? This question is interesting when using such a coil as part of a magnetically-coupled data or power transfer system, or when considering the electromagnetic interference produced by such a solenoid.

Some sources seem to suggest that the field will be confined more closely when a ferromagnetic core is used, as is the case for ferrite-core transformers. Other sources seem to suggest the opposite, namely that the ferromagnetic core increases the range of the field. Meanwhile a naive application of the Biot-Savart law would suggest that the field outside the solenoid does not change. Which of these is true?

To answer this question, I ran a simple simulation with and without an iron core in the excellent FEMM field solver. Here is a picture of the results:

This is an axisymmetric simulation around the z axis. To picture the results in 3D, mentally revolve the half-circle about the z axis at the left. The outer sphere can be ignored, it is just the boundary of the simulation volume.

The conclusion from the simulation is that — at least for a fixed current — the magnetic field is higher outside the solenoid (at any given distance) when an iron core is added.

But surely as we learnt in physics classes, we should be able to calculate the external magnetic field by summing up all the current elements of the solenoid, per the Biot-Savart law? Why does adding an iron core make a difference?

I believe the explanation is that there are “bound currents” at the boundaries of the iron core which also need to be included in any such piecewise calculation of the magnetic field. This stackexchange answer has some good diagrams that help to explain how these bound currents come about.
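To make the bound-current picture concrete, here is a rough numerical sketch of the on-axis external field. The free windings are summed as circular loops using the textbook on-axis Biot–Savart result, and the core is modelled as an azimuthal sheet of bound surface current K = M × n̂ on its curved surface. The uniform magnetization value is an illustrative assumption, not taken from the FEMM simulation:

```python
import numpy as np

mu0 = 4e-7 * np.pi

def loop_Bz(I, R, z):
    # On-axis field of a single circular current loop (standard Biot-Savart result).
    return mu0 * I * R**2 / (2 * (R**2 + z**2)**1.5)

# Solenoid: length 0.1 m, radius 0.01 m, 100 turns carrying 1 A.
L, R, N, I = 0.1, 0.01, 100, 1.0
z_turns = np.linspace(-L/2, L/2, N)
z_obs = 0.15  # on-axis observation point, well outside the winding

B_free = sum(loop_Bz(I, R, z_obs - z) for z in z_turns)

# Core model: uniform magnetization M (hypothetical value). The bound surface
# current K = M x n-hat is an azimuthal sheet carrying M amperes per metre of
# length, discretised here into thin loops.
M = 1e4   # A/m, illustrative only
nseg = 1000
z_sheet = np.linspace(-L/2, L/2, nseg)
dz = L / nseg
B_bound = sum(loop_Bz(M * dz, R, z_obs - z) for z in z_sheet)

print(B_free > 0, B_bound > 0)  # both contributions add at the external point
```

Since the bound-current contribution has the same sign as the free-current contribution at the external point, the core strengthens the external field — consistent with the simulation result.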

Now it should be noted that the above simulation has assumed a constant current of 1A in both cases. When the iron core is added, the inductance of the solenoid increases and we had to pump more energy into the solenoid to produce this 1A steady state — so perhaps it’s not surprising that the field is stronger outside the solenoid. To make a more “fair” comparison, I reduced the current through the iron core solenoid to produce the same total energy storage:

Notice that, at least for this geometry, the external magnetic field is still higher at any given distance.

So why do people consider iron/ferrite cores to confine fields more tightly? Well, the issue with the above solenoid design (or perhaps feature if you are trying to increase range) is that the return path of the magnetic field lines is through air and not through iron. We have boosted the magnetic field within the iron, but we’ve ended up with most of the field energy in the air.

If we were to put the iron all the way around the solenoid, which is closer to the usual design of transformers, the result does indeed confine the fields tightly (N.B. I have used the same colour mapping here so that it can be directly compared):

I hope that this post helps to clear up some confusion and might be of help the next time someone is searching for information about the external magnetic fields of ferrite core solenoids (like I was earlier today)! If you think I have got anything wrong, please let me know.

The first step in the process was to make a collection of existing designs that we both liked. Heather already knew that she liked diamonds with a step cut shape (such as the emerald cut) so this narrowed it down. Here is one of the pages of our initial design exploration:

We visited a local jeweller, who showed us some designs in similar styles. There were a few features that were missing though; for example, I really liked the gap between the center stone and halo in the vintage ring. For practical reasons I also wanted to design a ring that had a low side profile, as I had nightmares about Heather losing a ring finger in machinery.

Granted, I’m sure a good jeweller could have designed something fitting the bill, but I had been doing a bit of CAD recently and I thought… how hard would it be to design a ring myself?

(The answer, it turns out, is that it’s surprisingly hard if you haven’t done it before. I did intermittently find myself wishing that I was working more closely with a jeweller, but it was a labour of love…!)

Here you can see the evolution of my design in Onshape, from an initial shape, to the detailed design of halo and shoulder, and then some final changes based on manufacturing feedback:

The final design was manufactured using a lost wax casting process. In this process, the model is first 3D printed in a wax-like resin. A flask is placed around the model and flooded with a plaster-like investment material. Once the investment has hardened, the wax model is melted out, and metal is cast in the chamber that remains.

The printing and casting was done by Pure Casting in Sydney, who did an amazing job. I wasn’t sure how the design on the shoulder would turn out; the raised design is only about 0.2mm high, and about the same in line width, so it requires very high resolution. But it turned out perfectly. Pure Casting also arranged polishing and stone setting for me. (I thought briefly about doing it myself, but this time I decided against it!)

Here is a photo of a draft cast in silver, unpolished and without stones:

And here is the final ring:

Here are some notes about this design:

- In the final design, the ring is manufactured in two pieces and soldered together at the end, otherwise it would be difficult to polish in between the parts. (Thank you to Gillian at Pure Casting for this and other good advice.)
- The majority of ring designs use round/cylindrical prongs, but I used rectangular prongs here. I felt that this better suited the angular nature of the emerald cut, and I also think that this shape may be slightly better mechanically (a round prong with the same diameter has less metal, while a round prong with the same area is harder to bend over). Originally I added taper angles to the prongs, but it was suggested to keep them simple and leave finishing details to the setter.
- The majority of designs use flat or nearly-flat halos, but each face of this halo is inclined outwards significantly (35 degrees). Again, I wanted to echo the faces of the emerald cut, as well as providing a softer side profile to the ring. This means, however, that each of the eight faces of the halo is in a different plane, and this made the placement of the halo diamonds and prongs more challenging.
- To make sure the dimensions were right, I measured the actual diamond dimensions and angles from macro photos, and built a CAD model of the diamond. I did find a lot of curiosities; for example, I was also expecting the corners of an emerald cut to be cut at 45 degrees, but in fact they were around 42 degrees, and one corner was slightly ‘off’. The key with any gem setting, therefore, is to design in some tolerance.

The biggest challenge in the design was the engraving-like design on the shoulder of the ring. I was inspired by vines but, not having the artistic ability to draw vines, I made it more geometric. The final engraving design took all of my CAD knowledge, and some spline mathematics, to implement.

Heather and I are both thrilled with the end result. Thank you again to Onshape for building an awesome CAD system, and to Pure Casting for turning my design into reality.

There are two FeatureScript functions that can be used for creating splines directly in 3D: opFitSpline() and opCreateBSplineCurve().

opFitSpline() is exactly equivalent to the skFitSpline() function described earlier, except instead of working in the context of a 2D sketch (with an implicit sketch plane), it directly draws a curve in 3D. This can be more convenient than going via a sketch, and the curve need not be constrained to a single plane. The parameters are exactly the same as skFitSpline(), except that all of the points are specified as 3D instead of 2D vectors.

opCreateBSplineCurve() is the most powerful of the spline functions, and also the most daunting. It takes a set of B-spline knots and control points (refer back to part 1 for a crash course on B-splines). Unlike the other functions, this function lets you create splines of arbitrary degree, not just cubic splines. It also allows you to create closed splines (more on this shortly), and it allows you to create rational splines aka NURBS. I won’t go into rational splines in this article, but they allow you to draw certain curves that wouldn’t otherwise be possible to express in B-spline form.

To demonstrate use of these functions, here’s a 3D object that I’ve created by drawing a closed cross-sectional profile with opCreateBSplineCurve() and a guide path with opFitSpline(). I then swept the profile along the guide path with opSweep() and thickened the resulting surface with opThicken() to create a 3D object.

(I’ve left the guide path in the picture for illustration purposes.)

I’m not sure what this object actually is, perhaps it’s a handle for something, or an art piece for my future Onshape museum. But you can see how this might be useful if we need to create a computed surface in an actual engineering problem. Here is the full code:

```
// cross-sectional profile (closed spline in YZ plane)
// explanation follows after code
var degree = 3;
var isClosed = true;
var cpts = [vector(0,0,0)*meter, vector(0,-10,30)*meter, vector(0,30,10)*meter, vector(0,10,0)*meter];
// repeat first 3 (degree) control points at end
cpts = concatenateArrays([cpts, resize(cpts, degree)]);
var knots = range(0, 1, len(cpts)+degree+1);
var bspline = bSplineCurve(degree, isClosed, cpts, knots);
opCreateBSplineCurve(context, id + "spline1", {
    "bSplineCurve" : bspline
});

// path for sweep (interpolated spline in XY plane)
opFitSpline(context, id + "spline2", {
    "points" : [
        vector(  0,  0, 0) * meter,
        vector( 50, 20, 0) * meter,
        vector(100,  0, 0) * meter
    ],
    "startDerivative" : vector(30, 0, 0) * meter,
    "endDerivative" : vector(30, 0, 0) * meter
});

// sweep spline1 along spline2
opSweep(context, id + "sweep", {
    "profiles" : qCreatedBy(id + "spline1", EntityType.EDGE),
    "path" : qCreatedBy(id + "spline2", EntityType.EDGE),
    "keepProfileOrientation" : true
});

// thicken to create solid body
opThicken(context, id + "thicken", {
    "entities" : qCreatedBy(id + "sweep", EntityType.BODY),
    "thickness1" : 0 * meter,
    "thickness2" : 1 * meter
});

// normally we'd delete the sweep path
// opDeleteBodies(context, id + "deletespline2", {
//     "entities" : qCreatedBy(id + "spline2")
// });
```

One thing that warrants explanation is how I’ve selected the B-spline parameters to create a closed (‘periodic’) spline. For all of the B-splines discussed in part 1, I repeated the first and last knot four times to ensure that the spline passed through the start and end point (the knot sequences looked like [0,0,0,0,0.25,1,1,1,1], for example). In this case, we don’t need or want to clamp the curve to fixed endpoints. Instead, I’ve chosen a uniform set of knots between t=0 and t=1 ([0,0.1,0.2,…,0.8,0.9,1]). The resulting curve does *not* pass through the first or last control point, but is still guided by the control polygon. I then repeat the first three control points at the end of the list of control points — wrapping part of the control polygon around the curve a second time — which results in a closed spline.

Here is a plot of the spline and control polygon that might help illustrate this. The control polygon starts at the origin (0,0) and goes clockwise around through (-10,30), (30,10), (10,0) and back to (0,0), then retraces (-10,30), (30,10), (10,0). The resulting spline follows the left dotted line from the origin to the top of the egg (from t=0 to t=0.3), traces the egg once (from t=0.3 to t=0.7), and returns to the origin via the right dotted line (from t=0.7 to t=1). When Onshape draws the curve, it only draws the middle part from t=0.3 to t=0.7, so the result is the closed egg shape. (For a spline with degree 3, the outer three knot spans are degenerate — in that the basis does not add to 1 and the spline tends towards the origin regardless of control points — so these parts of the spline are always ignored when drawing.)
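If you’d like to experiment with this closure trick outside Onshape, the same construction can be reproduced with scipy’s BSpline (a sketch using the same control points and uniform knots as above, projected to 2D):

```python
import numpy as np
from scipy.interpolate import BSpline

degree = 3
cpts = np.array([[0, 0], [-10, 30], [30, 10], [10, 0]], dtype=float)
cpts = np.vstack([cpts, cpts[:degree]])            # repeat first 3 (degree) control points
knots = np.linspace(0, 1, len(cpts) + degree + 1)  # uniform knots [0, 0.1, ..., 1.0]
spline = BSpline(knots, cpts, degree)

# Only the middle spans are non-degenerate; the curve is drawn on
# [knots[degree], knots[-degree-1]] = [0.3, 0.7], and the repeated
# control points make it close up there:
t0, t1 = knots[degree], knots[-degree - 1]
print(np.allclose(spline(t0), spline(t1)))  # True: the curve is closed
```

For a uniform cubic B-spline, the value at a knot is (P_{i} + 4P_{i+1} + P_{i+2})/6 of the three surrounding control points, so repeating the first three control points makes the values at t = 0.3 and t = 0.7 coincide — which is exactly the closure property described above.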

Sometimes it’s necessary to generate a curve that’s “parallel” to another curve, maintaining a constant distance from it. In CAD this is called an **offset curve**. Here is an example in which I’ve drawn a curve and an offset curve at a given distance. I’ve then used the two curves to create a curved object with constant thickness:

Note that drawing an offset curve is not as simple as translating the curve (moving it in (x,y)). If I were to create this object by copying the bottom curve, dragging the copy towards the top left, and filling in the area between the two, I would end up with something that is far from constant thickness:

Rather, to create the offset curve, each point on the curve needs to be moved in a slightly different direction: the direction normal to the curve at that point.
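The difference is easy to see numerically. In this numpy sketch (using a parabola as a stand-in for the spline), a translated copy sits at a visibly varying distance from the original, while moving each point along the local unit normal keeps the gap constant:

```python
import numpy as np

t = np.linspace(-1, 1, 400)
curve = np.column_stack([t, t**2])   # a simple parabola as the base curve
D = 0.2

# Dense sampling of the base curve for nearest-distance queries.
td = np.linspace(-1.5, 1.5, 5000)
dense = np.column_stack([td, td**2])

def nearest_dist(points):
    d = np.linalg.norm(points[:, None, :] - dense[None, :, :], axis=2)
    return d.min(axis=1)

# Naive "offset": translate the whole curve by a fixed vector of length D.
translated = curve + D * np.array([-np.sin(0.5), np.cos(0.5)])  # arbitrary direction

# True offset: move each point along the local unit normal.
tangent = np.column_stack([np.ones_like(t), 2 * t])
normal = np.column_stack([-tangent[:, 1], tangent[:, 0]])
normal /= np.linalg.norm(normal, axis=1, keepdims=True)
offset = curve + D * normal

gap_t = nearest_dist(translated)
gap_o = nearest_dist(offset)
print(gap_t.max() - gap_t.min() > 0.05)  # translation: thickness varies noticeably
print(np.allclose(gap_o, D, atol=1e-3))  # offset: constant distance D (to sampling error)
```

(The constant-distance property of the normal offset holds as long as D stays below the minimum radius of curvature, which it does here.)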

The Onshape sketch GUI has an “offset” tool which makes offsetting a curve easy. I needed to do this programmatically in FeatureScript, however, and there are no documented FeatureScript functions for offset curves. I tried a number of approaches ranging from mathematics to advanced Onshape-wrangling. If you’re just interested in how to do it, you can skip down a few sections to the code.

Let’s start from the simplest case, a 2D spline with one piece (a 2D Bézier curve). This is a simple polynomial which, as mentioned in part 1, we can write in the parametric form:

x(t) = at^{3} + bt^{2} + ct + d
y(t) = et^{3} + ft^{2} + gt + h (1)

To produce the offset curve, each point on the curve needs to be translated by a constant distance (call it D) in a direction normal to the curve (n̂(t)). Thus we can write the offset curve as:

x′(t) = x(t) + D n̂_{x}(t)
y′(t) = y(t) + D n̂_{y}(t)

The unit normal n̂(t) is just the unit tangent rotated by 90 degrees. This 90 degree rotation is trivial; it is simply a co-ordinate rotation where (x,y) becomes (-y,x). (If we represent (x,y) in complex form as x+iy we simply multiply by i, which results in ix+i^{2}y = ix-y.) So we can rewrite the above equations in terms of the unit tangent (t̂_{x}(t), t̂_{y}(t)):

x′(t) = x(t) − D t̂_{y}(t)
y′(t) = y(t) + D t̂_{x}(t)

The tangent vector to a curve can be evaluated as (dx/dt, dy/dt), or in more convenient notation (ẋ(t),ẏ(t)), where a dot above a variable represents a derivative with respect to t. However, we need to scale the tangent to a unit vector by dividing by its length, sqrt(ẋ(t)^{2}+ẏ(t)^{2}). Substituting:

x′(t) = x(t) − D ẏ(t)/sqrt(ẋ(t)^{2}+ẏ(t)^{2})
y′(t) = y(t) + D ẋ(t)/sqrt(ẋ(t)^{2}+ẏ(t)^{2}) (2)

Now you can see where things are starting to get ugly. We can’t avoid the square root term in the denominator; D must multiply a unit vector or the new curve wouldn’t be a fixed distance away. In order to eliminate the square root in the denominator and turn this back into a polynomial, we have to rearrange and square… but then we no longer have x'(t) and y'(t) alone on one side.

It turns out that it’s not actually possible to express the offset curve in parametric polynomial form x'(t)=f(t) and y'(t)=g(t), regardless of the degree of f and g. Perusing the literature on the subject, we find that it’s theoretically possible to express the offset curve as a 10th order implicit polynomial, in the form f(x,y)=0 for some terrifying array of terms [1]. But this is about as useful to us as tits on a bull.

Because we can’t express the offset curve in parametric polynomial form, we can’t input it directly into Onshape as a spline. We can, however, use the above equations to evaluate the offset curve numerically at any point… assuming that we can easily evaluate ẋ(t) and ẏ(t).

If x(t) and y(t) are in polynomial form as in equation set 1, the first derivatives ẋ(t) and ẏ(t) follow trivially using the standard method of differentiating polynomials (ẋ(t) = 3at^{2} + 2bt + c, etc.). If x(t) and y(t) are in B-spline form, it turns out that there is also a simple method to generate the first derivative.

The derivative of a B-spline is a B-spline of one degree less, with the outer knots removed (e.g. a degree 3 spline with knots [0,0,0,0,0.25,1,1,1,1] becomes a degree 2 spline with knots [0,0,0,0.25,1,1,1]). The control points are replaced with the pairwise differences of control points (P_{2}−P_{1}, P_{3}−P_{2}, etc.), with some scaling corrections based on the knot distances. I’ve implemented this algorithm in the Octave/MATLAB function bsplinederiv.m.
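The same construction can be cross-checked in Python against scipy, whose `BSpline.derivative()` implements this exact algorithm. This sketch builds the derivative control points by hand and compares:

```python
import numpy as np
from scipy.interpolate import BSpline

degree = 3
knots = np.array([0, 0, 0, 0, 0.25, 1, 1, 1, 1], dtype=float)
cpts = np.array([[0, 0], [100, -50], [50, 150], [100, 100], [120, 60]], dtype=float)
spline = BSpline(knots, cpts, degree)

# Derivative: one degree lower, outer knots dropped, and control points replaced
# by scaled pairwise differences: Q_i = degree * (P_{i+1} - P_i) / (t_{i+degree+1} - t_{i+1}).
span = knots[degree + 1:-1] - knots[1:-degree - 1]
dcpts = degree * np.diff(cpts, axis=0) / span[:, None]
dspline = BSpline(knots[1:-1], dcpts, degree - 1)

t = np.linspace(0, 1, 50)
print(np.allclose(dspline(t), spline.derivative()(t)))  # True
```

Note the knot-distance scaling in `span` — for a clamped spline the outer spans are shorter, so the differences near the ends are scaled up accordingly.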

Now we can easily evaluate the offset curve for a B-spline using the above equations. Here I’ve plotted the spline from the figures above, and its offset curve, numerically in Octave:

```
octave:1> degree = 3;
octave:2> cpts = [0+0i;100-50i;50+150i;100+100i];
octave:3> knots = [0,0,0,0,1,1,1,1];
octave:4> [dcpts,dknots,ddegree] = bsplinederiv(degree,cpts,knots);
octave:5> t = [0:0.01:1];
octave:6> offset = 5;
octave:7> spline = bsplineeval(degree,cpts,knots,t);
octave:8> splinederiv = bsplineeval(ddegree,dcpts,dknots,t);
octave:9> unitnormal = normalize(splinederiv*i);
octave:10> plot(spline,'-',spline+offset*unitnormal)
```
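For completeness, here is a Python equivalent of the Octave session, using scipy in place of my bsplinederiv.m (plotting omitted; instead I verify that every offset point sits exactly D away from its source point on the curve):

```python
import numpy as np
from scipy.interpolate import BSpline

degree = 3
knots = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
cpts = np.array([[0, 0], [100, -50], [50, 150], [100, 100]], dtype=float)
spline = BSpline(knots, cpts, degree)
D = 5.0

t = np.linspace(0, 1, 200)
pts = spline(t)
deriv = spline.derivative()(t)
# Unit normal = unit tangent rotated 90 degrees: (x, y) -> (-y, x).
normal = np.column_stack([-deriv[:, 1], deriv[:, 0]])
normal /= np.linalg.norm(normal, axis=1, keepdims=True)
offset_pts = pts + D * normal

print(np.allclose(np.linalg.norm(offset_pts - pts, axis=1), D))  # True: exactly D away
```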

In theory we could import this into Onshape as a polyline approximation to the curve (made up of many line segments). With enough line segments, one could make the approximation ‘good enough’ for any given purpose. But this doesn’t seem very satisfying; it would be nice to have an actual curve at the Onshape kernel level so that there’s no risk of artefacts.

While we’ve found that the exact offset spline can’t be written as a parametric polynomial, another approach would be to find a polynomial that’s ‘close’ to it. There are many different ways of doing this. The method I tried was to start with two assumptions:

- The offset spline should start/end at points that are correctly offset from the start/end points of the original spline.
- The tangent to the offset spline should be parallel to the tangent of the original spline at these end points.

Neither of these requirements on the end points are strictly required; if we were to relax them we might be able to come up with curves that are a better approximation at other points. But constraints on the end points are convenient because they will make it possible to offset a spline piece-by-piece and have the offset splines match up.

For a simple one-piece spline, aka a Bézier curve, this only leaves two parameters unspecified: the scale values α and β on the start and end derivatives.

In order to lock down α,β let’s arbitrarily pick two more points along the offset curve and solve for the required α,β so the curve passes through these points:

It may seem counter-intuitive that we can fit a spline through four points *and* two tangent directions given that we only have eight free parameters in our parametric polynomial. Frankly, this had me scratching my head for a while too. But — referring back to the parametric polynomial defined in equation set 1 — note that when we specify an (x,y) without a t value, we’re providing (x,y) but introducing a new unknown for t, so one fit point only reduces the equation set by one unknown.

After a little algebra this produces a set of four polynomial equations in the four unknowns *α*, *β*, *t_{1}* and *t_{2}*, where *t_{1}* and *t_{2}* are the (unknown) curve parameters at the two fit points.

Perhaps due to masochism, I had a burning desire to try and solve this algebraically, i.e. to try and find a closed form solution for the variables *α*,*β* in terms of known quantities. After all, it seems like a relatively simple geometric problem, and we have four equations and four unknowns.

A naïve approach might be to use the general cubic solution to rearrange equations (3) and (4) for *t_{1}*, set the two expressions equal, and thereby eliminate *t_{1}*.

I also tried entering this equation set into a couple of symbolic computation packages (Maple and SAGE) and praying for a solution (solve()). In both cases they ran out of memory without a result. The reason is that they attempt to apply a general method for solving simultaneous polynomial equations, based on computing Gröbner bases (click if you dare, but honestly that Wikipedia page is pretty impenetrable to anyone who doesn’t already have a PhD in computing Gröbner bases). The method is exponential in the number of variables, and while it can handle large polynomials over a handful of variables, it is not a good method when there are many variables.

I threw down the gauntlet on Facebook, calling on maths geeks everywhere to help. To my disappointment, no-one tried to attack this algebraically; all of the suggestions leaned towards numerical approaches. One commenter (Kenneth Tsui) very kindly sent me an Excel spreadsheet that implemented the solution numerically using Newton’s method, an iterative method that successively improves an initial guess using derivatives of the error function. The geometry of the problem is such that we can make good initial guesses (*α* = 1, *β* = 1, *t_{1}* = 1/3, *t_{2}* = 2/3), so the iteration converges readily.

Personally I would never have thought to use Excel like this, but it turns out that even a hatchet can be used to hammer in a nail 😉
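For anyone who prefers Python to Excel, the same numerical solution can be sketched with scipy’s `fsolve` (a Newton-type solver). The parameterisation follows the construction above — the offset Bézier has fixed, correctly offset endpoints and tangent directions parallel to the original, leaving the scales α and β free — and the fit points B and C are taken at t = 0.4 and 0.6 on the exact offset curve (those particular parameter values are my arbitrary choice):

```python
import numpy as np
from scipy.optimize import fsolve

# Original cubic Bézier control points.
Q = np.array([[0, 0], [100, -50], [50, 150], [100, 100]], dtype=float)

def bezier(P, t):
    t = np.atleast_1d(t)[:, None]
    return ((1-t)**3*P[0] + 3*(1-t)**2*t*P[1] + 3*(1-t)*t**2*P[2] + t**3*P[3])

def unit_normal(P, t):
    # Tangent rotated 90 degrees, normalised.
    t = np.atleast_1d(t)[:, None]
    d = 3*(1-t)**2*(P[1]-P[0]) + 6*(1-t)*t*(P[2]-P[1]) + 3*t**2*(P[3]-P[2])
    n = np.column_stack([-d[:, 1], d[:, 0]])
    return n / np.linalg.norm(n, axis=1, keepdims=True)

D = 5.0
# Exact offset points: two interior fit targets B and C, plus the fixed endpoints.
B, C = bezier(Q, [0.4, 0.6]) + D * unit_normal(Q, [0.4, 0.6])
P0 = bezier(Q, 0.0)[0] + D * unit_normal(Q, 0.0)[0]
P3 = bezier(Q, 1.0)[0] + D * unit_normal(Q, 1.0)[0]

def residuals(v):
    alpha, beta, t1, t2 = v
    # Offset Bézier: fixed endpoints, tangents parallel to the original,
    # free scales alpha and beta on the end derivatives.
    P = np.array([P0, P0 + alpha*(Q[1]-Q[0]), P3 - beta*(Q[3]-Q[2]), P3])
    return np.concatenate([bezier(P, t1)[0] - B, bezier(P, t2)[0] - C])

sol = fsolve(residuals, [1.0, 1.0, 1/3, 2/3])  # the good initial guesses
print(np.round(sol, 3), np.linalg.norm(residuals(sol)))
```

Four scalar equations, four unknowns, and a good starting point: the solver converges quickly for well-behaved splines, just as the spreadsheet did.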

I tried this for a few splines. For simple splines the results are good, but when there’s high curvature the deviation can be noticeable (green is the real offset curve, red the approximation):

One could improve the result with a more targeted choice of points B and C, or by adding more freedom to the offset spline — perhaps a higher order polynomial, or more control points — but the mathematics was breaking my head already just trying to fit a cubic Bézier. Let’s go back a step and try another approach…

Remember that I mentioned that Onshape is able to create an offset curve quite effortlessly when using the offset tool in the sketch GUI. How does it do this?

Onshape has a handy feature that you can view the code generated by the GUI (right click on a Part Studio → View Code). By perusing the code produced by the offset tool, I did eventually figure out how to create an offset curve from FeatureScript. The key is to create a new spline using skSpline() or skSplineSegment(), and then add a DISTANCE constraint that constrains it to be a certain distance away from the first spline:

```
skFitSpline(sketch, "spline1", {
    "points" : [
        vector(0,0) * meter,
        vector(100,100) * meter
    ],
    "startDerivative" : vector(300,-150) * meter,
    "endDerivative" : vector(150,-150) * meter
});
skSplineSegment(sketch, "spline2", {});
skConstraint(sketch, "distance", {
    "constraintType" : ConstraintType.DISTANCE,
    "localFirst" : "spline1",
    "localSecond" : "spline2",
    "length" : 5 * meter,
    "direction" : DimensionDirection.MINIMUM,
    "alignment" : DimensionAlignment.ANTI_ALIGNED
});
// not strictly required?
skConstraint(sketch, "offset", {
    "constraintType" : ConstraintType.OFFSET,
    "localMaster" : "spline1",
    "localOffset" : "spline2"
});
```

The OFFSET constraint does not seem to be required for the solver to do the right thing here, it may just be for GUI display purposes, but I’ve left it in the code example just in case.

There is one oddity with this. We have not specified any constraints on the endpoints, and sometimes the solver takes more liberty with the second endpoint than you might like. For example, depending on spline parameters, you can get something like this:

I’ve not been able to figure out the logic of how the solver chooses the second endpoint; however, this is of purely academic interest — in practice, the requirements of the CAD problem will usually dictate physical constraints on the endpoints. If you want to force the second curve to end at the same point as the first curve, you can add an extra constraint between the endpoints, for example:

```
skConstraint(sketch, "distance.end", {
    "constraintType" : ConstraintType.DISTANCE,
    "localFirst" : "spline1.end",
    "localSecond" : "spline2.end",
    "length" : 5 * meter,
    "direction" : DimensionDirection.MINIMUM,
    "alignment" : DimensionAlignment.ANTI_ALIGNED
});
```

In the earlier picture where I used an offset spline to create a 3D object, I needed to join the spline ends with two line segments to form a face. As in the sketch GUI, this can be done by using COINCIDENT constraints. (The code is simple but rather verbose; if you need many such constraints in your FeatureScript it may be worth writing a makeCoincident() function.)

```
skLineSegment(sketch, "line1", {});
skConstraint(sketch, "coincident.line1.start", {
    "constraintType" : ConstraintType.COINCIDENT,
    "localFirst" : "line1.start",
    "localSecond" : "spline1.start"
});
skConstraint(sketch, "coincident.line1.end", {
    "constraintType" : ConstraintType.COINCIDENT,
    "localFirst" : "line1.end",
    "localSecond" : "spline2.start"
});
skLineSegment(sketch, "line2", {});
skConstraint(sketch, "coincident.line2.start", {
    "constraintType" : ConstraintType.COINCIDENT,
    "localFirst" : "line2.start",
    "localSecond" : "spline1.end"
});
skConstraint(sketch, "coincident.line2.end", {
    "constraintType" : ConstraintType.COINCIDENT,
    "localFirst" : "line2.end",
    "localSecond" : "spline2.end"
});
```

The sketch GUI also has a “slot” tool that allows you to conveniently draw two offset curves, with semicircular end caps. This can be implemented in FeatureScript following the same approach as the above code, replacing skLineSegment() with skArc(); indeed under the bonnet this is what the slot tool generates.

While we can now create an offset curve in a sketch, we still haven’t really answered the question of how Onshape is doing this magic. Have the wizards at Onshape cracked the problem of how to derive offset spline parameters from the original spline?

While their wizardry is impressive, the answer is no, Onshape actually uses a much simpler method. It stores an offset curve as the original spline parameters *and* an offset value. The points of the offset curve can then be calculated numerically at evaluation time using an approach equivalent to equation set 2. If you’ve attempted to follow my mathematics above, you’ll now appreciate how much simpler this is.

Unfortunately, there does not seem to be any way (that I can find) to specify the offset value explicitly when creating a spline. For splines used in sketches, you can use a DISTANCE constraint as shown above, and the solver will assign the right offset to the second spline. This is a little tedious but works well. However it can’t be applied to splines created outside of sketches (opFitSpline() or opCreateBSplineCurve()); in this case there is no way that I can find to specify the offset value. I hope that the ability to specify an explicit offset might be added in a future version.

As a side note, while the Onshape computational engine evaluates curve points precisely where needed, the objects displayed in the GUI are built from straight edges, and the rendering is not refined as you zoom in. Thus, attempting to place elements visually may produce unexpected results. For example, below is a line segment that is constrained to be coincident with a spline; in the zoomed-in GUI, there is a gap you could drive a large bacterium through. The line segment does, however, touch the real spline.

For the most part the accuracy of the GUI approximation is a moot point, of course. The underlying solver does evaluate the points precisely, and the generated geometry is correct. The take-home message is that — as in any parametric CAD system — you should specify any constraints that you need and not rely on visual placement in the GUI.

When drawing offset curves on the concave side of a spline, there is a problem that sometimes arises: the offset curve may run into itself (in fact, this always happens for *some* offset distance, but it’s most problematic when the spline has high curvature). If we blindly apply equation set 2 and plot the curve that results, we see that the offset curve has a loop:

(If you’re surprised by the shape of this loop, take something like an eraser and sweep it along the path; there is a point where the top of the eraser has to start moving downwards to make it around the tight bend.)

In this case, Onshape throws up its hands and refuses to create the offset curve. Sometimes this can happen even when one is not explicitly working with offset curves: for instance, the “thicken” operation I used earlier in this article requires the calculation of offset curves, which may fail depending on the thickness value. In that example, a value of 1 meter works okay, but 2 meters fails (in either direction), and it’s not immediately obvious where in the curve the self-intersection is. This can be quite frustrating when working with parametric objects; a small tweak in parameters can cause the object to suddenly fail to render.
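There is a clean criterion for when the loop first appears: it happens once the offset distance exceeds the minimum radius of curvature on the concave side, because the offset then overshoots the centre of the tightest bend. A quick numeric illustration in Python (a parabola of my own choosing, not an Onshape example):

```python
def curvature(dx, dy, ddx, ddy, t):
    """Curvature of a parametric curve from its first and second derivatives."""
    return abs(dx(t) * ddy(t) - dy(t) * ddx(t)) / (dx(t) ** 2 + dy(t) ** 2) ** 1.5

# Parabola (t, t^2): curvature 2/(1+4t^2)^(3/2) peaks at t=0, so the minimum
# radius of curvature is 0.5. Offsetting the concave side by more than 0.5
# produces exactly this kind of self-intersecting loop.
k_max = curvature(lambda t: 1.0, lambda t: 2 * t, lambda t: 0.0, lambda t: 2.0, 0.0)
min_radius = 1 / k_max
```

Scanning for the minimum radius of curvature along a curve is one way to predict which thicken/offset values will fail before trying them.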

There do exist algorithms that can trim loops from offset curves. Even after trimming the loop, there is still the complication that there is a cusp (vertex) in the curve where the loop was, which changes the geometry of any objects built from this curve: for example, if you were to extrude the above curve, there is now an extra edge in the knee. I’ve noticed that Onshape refuses to draw splines with cusps, which is likely to avoid this problem of surplus geometry. One way around this would be to interpolate an arc at the cusp (Maya can do this, for example), but this clearly involves a lot of added complexity in the computation.

I do hope that Onshape considers supporting some form of offset curve trimming in the future, or at least giving more detailed feedback when curve problems occur. This would make the system considerably more robust. As it is, I’ve found myself ‘testing’ my curves in Octave/MATLAB before throwing them at Onshape, as otherwise I can end up with a blank sheet and no idea where I went wrong.

There’s a lot I haven’t covered in this article. In order to understand how Onshape represents offset curves, I sniffed the Websocket communication between the browser and the backend, which was an interesting rabbit hole. While the Developer Tools in both Chrome and Firefox can display Websocket communication, neither could dump binary Websocket streams in any useful format that I could use for further dissection. So instead I used mitmproxy to interpose on the SSL session and Wireshark/tshark to capture the communication. I then ran into the further obstacle that the Websocket traffic was compressed, and the Internet-given dissector plugin for this was deficient in a number of ways. A quick crash course in Lua later, and I have an improved version of this plugin on my github account (websocket_deflate_postdissector).

As usual I’ve learnt a lot of things about a lot of things in the process, and hopefully some of this information will be useful to you too. However, it’s probably time to get back to my actual design project.

I later realised a better way to tackle equations (3)–(7) algebraically might be the method of resultants; this allows eliminating *t_{1}* from equations (3) and (4) without having to rearrange for

Perhaps numerical methods are not so bad, after all.

First can I say how much I love Onshape. I have no association with the company other than being a user, so you can trust that this is real, genuine puppy love. Onshape provides much of the functionality of parametric CAD systems such as SolidWorks *in your browser*, which at first sight seems like magic. Since I’m a Linux user and the native CAD landscape on Linux is quite limited, being able to do CAD in a browser works brilliantly for me. Even better, it’s free for non-commercial use.

My first little project in Onshape, several moons ago, was a vertical illuminator for my home-brew microscope. The problem I had was that when using high magnifications — which require the lens very close — I couldn’t get enough light in under the lens to illuminate opaque samples. A vertical illuminator is the solution to this problem, sending light down through the lens via a beamsplitter. I designed the physical parts in Onshape, exported to STL format and uploaded to 3DHubs; a local maker printed the parts and within a few days I had it back and working.

Onshape has a scripting language called FeatureScript. FeatureScript allows extending the Onshape interface with domain-specific tools (*features* as they’re known in Onshape). For example, if you are doing furniture design in Onshape, you can write features that produce panels and joints and then do high-level design in the user interface, rather than having to work with low-level primitives such as lines and boxes. Clearly this can be a huge boon to productivity. Here is a screenshot from one of the Onshape demonstration videos showing a custom box joint feature that, once created, can be used just like built-in features:

As a programmer and perfectionist I like the idea of writing code to simplify the CAD process rather than spending hours placing parts in a GUI. Prior to Onshape I’ve used OpenSCAD a few times. But for complex designs it gets difficult to mentally keep track of what’s where without having a live preview, and OpenSCAD rendering can get very slow once you start chamfering, filleting and making things pretty. This is where Onshape and FeatureScript shine.

Recently I’ve been doing a bit of jewellery design in Onshape, and this involves a lot of curves and curved surfaces. While I can approximately create the curves I want in the GUI, I wanted to do it programmatically in FeatureScript. Unfortunately a number of the functions that I needed to use were minimally documented so I had to do a lot of trial and error in the process. This article is an attempt to explain some of the missing links for those following in my footsteps.

Onshape is still being developed at a breakneck pace, and since I started writing this article there are now a number of new features related to curves including the option to directly create splines in 3D. I’ll start, though, from the ‘traditional’ way of creating curves in Onshape — by creating them in a 2D sketch and then lofting/extruding/etc. — and I’ll briefly mention 3D curves later in the piece.

There’s a lot of mathematical jargon around splines and polynomials, so I’ll try to use both the technically correct terms and plain English where possible.

A spline is a piecewise polynomial. The curve is made up of one or more pieces, where each piece is a polynomial. The polynomials are normally chosen such that they “match up” at the transitions and you end up with something that looks like a single continuous curve. There can be various definitions of “matching up”, however to produce a visually smooth curve, at least the function values and the first derivative need to match (C1 continuity), and usually the second derivative is also chosen to match (C2 continuity).

In the case of Onshape splines used in sketches, the pieces are normally cubic polynomials — polynomials of maximum degree 3 — in two dimensions x and y. However, I should be clearer about what I mean by that, as there are at least three different things that “cubic polynomials in two dimensions” could mean:

Explicit form: y = f(x):

y = ax^{3}+ bx^{2}+ cx + d

(defined uniquely by 4 free constants: a, b, c, d)

Parametric form: x = f(t), y = g(t) for some parameter t:

x = at^{3}+ bt^{2}+ ct + d

y = et^{3}+ ft^{2}+ gt + h

(defined uniquely by 8 free constants: a, b, c, d, e, f, g, h)

Implicit form: f(x, y) = 0:

ax^{3}+ bx^{2}y + cxy^{2}+ dy^{3}+ ex^{2}+ fxy + gy^{2}+ hx + ky + l = 0

(defined uniquely by 10 free constants: a, b, c, d, e, f, g, h, k, l)

The explicit form can produce, say, a parabola y=x^{2}, but can’t produce a parabola rotated by 90 degrees. Therefore it makes little sense for software such as Onshape which must be able to produce curves in any orientation.

The implicit form is the most powerful, in that it can express curves that can’t be expressed in the other two forms, but it is difficult to evaluate the set of points that are part of the curve.

Therefore, as might be expected, Onshape curves are of the parametric type: a curve in two dimensions is produced parametrically by evaluating functions x(t) and y(t) for t from 0 to 1. The functions x(t) and y(t) are splines: there may, for example, be one polynomial piece from t=0 to t=0.5 and one from t=0.5 to t=1. Some of the FeatureScript functions related to curves, such as evEdgeTangentLine() and evEdgeCurvature(), take this parameter t as an argument.

Splines can be created programmatically in an Onshape sketch using the skFitSpline() function. First let’s start with a curve with only one polynomial piece between two points (0,0) and (100,100). In FeatureScript we can write:

```
skFitSpline(sketch, "spline1", {
    "points" : [ vector(0,0) * meter, vector(100,100) * meter ],
    "startDerivative" : vector(150,0) * meter,
    "endDerivative" : vector(0,150) * meter
});
```

Here is the result:

Note that the startDerivative and endDerivative specify the starting and ending direction; if I had only specified a start and end point then the result would have just been a straight line.

When I was first experimenting with this, my first question was: why do the start and end derivatives have length units (in this case, 150 *meters*)? With a bit of experimentation it’s clear that these vectors should be scaled together with the curve, i.e. if we scale the curve up by a factor of two we should also double the startDerivative and endDerivative. Also, a larger magnitude makes the curve launch with more momentum in the given direction; for example, here is the curve with startDerivative increased to (500 meters, 0):

But what does the magnitude of these vectors actually mean?

It all becomes clearer, however, when you consider the curves in parametric form as described in the previous section: x=f(t) and y=g(t). It turns out that the given derivative vectors are the **derivatives of (x(t), y(t)) with respect to the parameter t (i.e. (dx/dt, dy/dt))** at t=0 and t=1. If this is too abstract to visualize, there is also a simple mapping to Bézier control points that I’ll explain below.

It turns out that the start point and end point, and the two derivative vectors – four (x,y) pairs in total – uniquely define the eight parameters of the parametric cubic polynomial. Thus, any parametric cubic polynomial can be specified in this way.

If you don’t specify the derivative at the start point or end point, the second derivative at that point is set to zero, which is sufficient to provide the missing constraint on the polynomial.

If you are familiar with Bézier curves the above discussion will sound familiar. A Bézier curve is defined by four control points (call them P_{1}, P_{2}, P_{3}, P_{4}). The curve launches from P_{1} in the direction of P_{2}, and then approaches P_{4} from the direction of P_{3}. If P_{2} is further away from P_{1}, then the curve launches with more momentum in the given direction.

It turns out that these formulations are equivalent with a tiny amount of maths, and if you have the Bézier control points, you can calculate the required startDerivative and endDerivative as:

startDerivative = 3(P_{2}− P_{1})/Δt

endDerivative = 3(P_{4}− P_{3})/Δt

(1)

where Δt will be 1 for the simple case that t varies from 0 to 1 across the curve segment, and the 3 arises from the degree of the polynomial (because d/dt(t^{3}) = 3t^{2}). Thus, for a single-piece spline, the startDerivative and endDerivative will be **three times the distance to the corresponding Bézier control point**.
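In code the conversion is trivial; here is a small Python sketch of equation set 1 (the helper name is mine; the check uses the single-piece example above, whose inner Bézier control points work out to (50, 0) and (100, 50) by this same relation):

```python
def bezier_to_derivatives(p1, p2, p3, p4, dt=1.0):
    """Convert cubic Bezier control points to skFitSpline-style end derivatives:
    each derivative is 3x the vector to the adjacent control point, divided by
    the parameter span dt (equation set 1)."""
    start = tuple(3 * (b - a) / dt for a, b in zip(p1, p2))
    end = tuple(3 * (b - a) / dt for a, b in zip(p3, p4))
    return start, end

# Endpoints (0,0) and (100,100) with inner control points (50,0) and (100,50)
# correspond to the derivatives (150,0) and (0,150) used earlier.
start, end = bezier_to_derivatives((0, 0), (50, 0), (100, 50), (100, 100))
```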

Of course you can also calculate the Bézier control points from the startDerivative and endDerivative by rearranging these equations for P_{2} and P_{3}:

P_{2}= P_{1}+ startDerivative·Δt/3

P_{3}= P_{4}− endDerivative·Δt/3

(2)

You can evaluate the spline at any point using the evEdgeTangentLine() function in FeatureScript, providing the parameter value t (from 0 to 1). The Line object that is returned has an origin that provides the evaluated point and has a direction that is tangent to the spline.

```
var curve = qCreatedBy(id + "sketch", EntityType.EDGE);
var tangent = evEdgeTangentLine(context, {
    "edge": curve,
    "parameter": 0.5,
    "arcLengthParameterization": false
});
println(tangent.origin);
println(tangent.direction);
```

Debug pane output:

```
(68.75 meter, 31.25 meter, 0 meter)
(0.7071067811865475, 0.7071067811865475, 0)
```

This can be useful for performing further geometry calculations after drawing a spline. For completeness I’ll note that there is also an evEdgeTangentLines() function — which is identical but allows evaluating the curve at multiple points in one call — and an evEdgeCurvature() function — which returns not only a tangent but also normal and binormal vectors.
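You can also reproduce this evaluation outside Onshape: a single-piece spline defined by two endpoints and two end derivatives is exactly a cubic Hermite curve. A Python sketch (my own helper, with units dropped and t assumed to run from 0 to 1) reproduces the debug pane values:

```python
import math

def hermite(p0, p1, m0, m1, t):
    """Evaluate a cubic Hermite curve with endpoints p0, p1 and derivatives
    m0, m1 (with respect to t). Returns the point and unnormalized tangent."""
    h00, h10 = 2*t**3 - 3*t**2 + 1, t**3 - 2*t**2 + t
    h01, h11 = -2*t**3 + 3*t**2, t**3 - t**2
    d00, d10 = 6*t**2 - 6*t, 3*t**2 - 4*t + 1   # derivatives of the basis
    d01, d11 = -6*t**2 + 6*t, 3*t**2 - 2*t
    point = tuple(h00*a + h10*c + h01*b + h11*d for a, b, c, d in zip(p0, p1, m0, m1))
    tangent = tuple(d00*a + d10*c + d01*b + d11*d for a, b, c, d in zip(p0, p1, m0, m1))
    return point, tangent

# The single-piece spline from earlier, evaluated at t=0.5:
point, tangent = hermite((0, 0), (100, 100), (150, 0), (0, 150), 0.5)
length = math.hypot(*tangent)
direction = (tangent[0] / length, tangent[1] / length)
```

Evaluating at t=0.5 gives the point (68.75, 31.25) and direction (0.7071, 0.7071), matching evEdgeTangentLine().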

Now let’s add another point to the spline:

```
skFitSpline(sketch, "spline1", {
    "points" : [ vector(0,0) * meter, vector(10,10) * meter, vector(100,100) * meter ],
    "startDerivative" : vector(150,0) * meter,
    "endDerivative" : vector(0,150) * meter
});
```

Here is the result:

Now there are two polynomial pieces, one that goes from (0,0) at t=0 to (10,10) at t=0.25, and one that goes from (10,10) at t=0.25 to (100,100) at t=1. The breakpoint between the two – called a *knot* – is at t=0.25.

Why is the knot at t=0.25? Well, this knot could actually be placed anywhere in ‘t space’, for instance t=0.1 or t=0.5, but as the second piece of the curve is much longer there is an argument for assigning more ‘t space’ to the second piece. Onshape uses the square root of the chord length between points as the metric; the lengths of the two chords here are 14.1m and 127.3m so the knot location is chosen as sqrt(14.1m)/(sqrt(14.1m)+sqrt(127.3m)) = 0.25.
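This square-root-of-chord-length scheme (often called centripetal parameterization) is easy to check numerically; here is a small Python sketch of the calculation (variable names mine):

```python
import math

def knot_parameters(points):
    """Cumulative knot parameters using the square root of chord length
    between successive fit points, normalized to [0, 1]."""
    chords = [math.dist(a, b) for a, b in zip(points, points[1:])]
    weights = [math.sqrt(c) for c in chords]
    total = sum(weights)
    params, acc = [0.0], 0.0
    for w in weights:
        acc += w
        params.append(acc / total)
    return params

# Fit points (0,0), (10,10), (100,100): the interior knot lands at t = 0.25.
params = knot_parameters([(0, 0), (10, 10), (100, 100)])
```

The ratio here happens to come out at exactly 0.25 because the second chord is 81 times longer than the first, and the fourth root of 81 is 3.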

The polynomial pieces are chosen such that both the first and second derivative are continuous through the middle point, which uniquely defines the two polynomials. The resulting curve ends up being the same as what would be produced by the following two skFitSpline calls:

```
skFitSpline(sketch, "spline1.piece1", {
    "points" : [ vector(0,0) * meter, vector(10,10) * meter ],
    "startDerivative" : vector(37.5,0) * meter,
    "endDerivative" : vector(8.4375,17.8125) * meter
});
skFitSpline(sketch, "spline1.piece2", {
    "points" : [ vector(10,10) * meter, vector(100,100) * meter ],
    "startDerivative" : vector(25.312,53.438) * meter,
    "endDerivative" : vector(0,112.5) * meter
});
```

… where I’ve calculated the required piece1 endDerivative and piece2 startDerivative so that both first and second derivatives match at the joining point (10,10). Note that all of the derivatives end up scaled when splitting the curve, as they are derivatives with respect to t and t is no longer the same variable (now *each piece* has t from 0 to 1).

As a corollary, if you only care about first derivative continuity and not second derivative continuity, then you can actually get a larger variety of curves by drawing the two parts individually: then you can choose any value for the piece1 endDerivative and piece2 startDerivative as long as the direction matches.

Actually Onshape represents all of these splines in terms of B-splines rather than as an array of polynomial coefficients a,b,c,d,e,f,g,h for each piece. Contrary to popular belief, there is nothing voodoo about B-splines; they are just a tool for expressing splines in a more convenient form.

The basic idea is that we can separate a spline function into a set of control points **c**, which define an approximate path for the spline, and a set of functions **b**(t), which give the weighting of each control point at a parameter value t along the curve. The spline may not pass through all or even any of the control points **c**, but moving the control points allows you to control the path of the curve.

For example, the control points for the spline that we generated in the previous section are depicted below (I’ll explain how these are determined from the skFitSpline() parameters in a moment):

If we think of **c** as a column vector of control points and **b**(t) as a row vector, then at any point we can evaluate the spline function as:

**s**(t) = **b**(t)**c**

(3)

**b**(t), which weights the control points, is chosen such that it has a number of nice properties. The sum of components in **b**(t) should always be 1 for any t, which is required for scale-invariance (i.e. if you scale up the control polygon **c**, the spline should scale with it). Also, **b**(t) should be smooth everywhere (i.e. continuous first and/or second derivatives), such that **s**(t) is smooth everywhere.

A suitable function **b**(t) can be generated using what’s called the Cox-de Boor recursion formula. The only input to this algorithm is an ordered list of t values called knots – these are transition points between a set of underlying polynomials used to construct **b**(t).

To make it easy to experiment with B-splines, I wrote a simple Octave/MATLAB function that generates **b**(t) for a given polynomial degree and a given set of knots. The code is here: bsplinebasis.m. For example, for the previous curve with a knot at 0.25, we can evaluate:

```
octave:1> degree=3;
octave:2> knots=[0,0,0,0,0.25,1,1,1,1];
octave:3> bsplinebasis(degree, knots, 0.5)
ans =
0.00000 0.16667 0.44444 0.35185 0.03704
```

In other words, at t=0.5, the five control points of the spline are weighted according to this vector. Notice that here (and in fact anywhere beyond the first knot at t=0.25) the first control point has no effect; this is a B-spline property called local support.

I’ve failed to explain something. It makes sense that the knot locations are at t=0,0.25,1, which corresponds to the pieces of the spline, but notice that I’ve repeated the t=0 and t=1 knots four times on line 2. This is called knot multiplicity. In order for the spline to be “clamped” to the first and last control points, these knots must be repeated at least n+1 times (where n is the degree of the polynomial). A picture that illustrates this can be found on this website.
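If you’d rather not fire up Octave, the Cox-de Boor recursion is short enough to sketch in Python too. This is my own minimal equivalent of bsplinebasis(), using the usual convention that 0/0 terms are treated as zero:

```python
def bspline_basis(degree, knots, t):
    """Evaluate all B-spline basis functions of the given degree at parameter t
    via the Cox-de Boor recursion. Returns one weight per control point."""
    n = len(knots) - degree - 1  # number of basis functions / control points
    # Degree 0: indicator functions of the knot spans
    b = [1.0 if knots[i] <= t < knots[i + 1] else 0.0 for i in range(len(knots) - 1)]
    if t >= knots[-1]:  # special-case t exactly at the end of the curve
        b[max(i for i in range(len(knots) - 1) if knots[i] < knots[i + 1])] = 1.0
    for p in range(1, degree + 1):
        for i in range(len(knots) - p - 1):
            # Zero-width knot spans contribute nothing (0/0 := 0)
            left = (t - knots[i]) / (knots[i + p] - knots[i]) \
                if knots[i + p] != knots[i] else 0.0
            right = (knots[i + p + 1] - t) / (knots[i + p + 1] - knots[i + 1]) \
                if knots[i + p + 1] != knots[i + 1] else 0.0
            b[i] = left * b[i] + right * b[i + 1]
    return b[:n]

weights = bspline_basis(3, [0, 0, 0, 0, 0.25, 1, 1, 1, 1], 0.5)
# Matches the Octave output: [0, 1/6, 4/9, 19/54, 1/27]
```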

We can now draw the spline in Octave and confirm that it looks the same as in Onshape, and indeed it does:

```
octave:4> t=[0:0.01:1];
octave:5> cpts=[0+0i; 12.5+0i; -8.75+16.25i; 100+62.5i; 100+100i];
octave:6> plot(bsplinebasis(degree,knots,t)*cpts,'-',cpts,'--.')
```

In the above code I’ve expressed the control points as complex numbers x+iy as a convenience. I could also have done, equivalently:

```
octave:7> xcpts=[0; 12.5; -8.75; 100; 100];
octave:8> ycpts=[0; 0; 16.25; 62.5; 100];
octave:9> plot(bsplinebasis(degree,knots,t)*xcpts, bsplinebasis(degree,knots,t)*ycpts)
```

Note that the control points are *not* the same as the points you specify to skFitSpline(). In this example, the specified points to skFitSpline() were (0,0), (10,10), (100,100) with startDerivative of (150,0) and endDerivative of (0,150), while the corresponding spline control points are (0,0), (12.5,0), (-8.75,16.25), (100,62.5), (100,100). How did I arrive at these control points? The first and last fit points become the first and last control points. The second and second-last control points can be calculated from the startDerivative and endDerivative using equation set 2. The other control points are derived from the remaining fit points with linear algebra, using equation 3.
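The second and second-last control points follow from a standard property of clamped B-splines: the start derivative is degree·(c_{1} − c_{0})/(u_{degree+1} − u_{1}), and symmetrically at the end, so the Δt in equation set 2 becomes the width of the first (or last) knot span. A Python sketch (my own helper names) checking the numbers quoted above:

```python
def inner_control_points(first, last, start_deriv, end_deriv, degree, knots):
    """Second and second-last control points of a clamped B-spline from its
    end derivatives: c1 = c0 + s'(0)*(u[p+1]-u[1])/p, and mirrored at the end."""
    p = degree
    span_start = knots[p + 1] - knots[1]       # width of the first knot span
    span_end = knots[-2] - knots[-(p + 2)]     # width of the last knot span
    c1 = tuple(a + d * span_start / p for a, d in zip(first, start_deriv))
    c_penult = tuple(a - d * span_end / p for a, d in zip(last, end_deriv))
    return c1, c_penult

# The example spline: derivatives (150,0) and (0,150) with knots
# [0,0,0,0,0.25,1,1,1,1] give control points (12.5, 0) and (100, 62.5).
c1, c3 = inner_control_points((0, 0), (100, 100), (150, 0), (0, 150),
                              3, [0, 0, 0, 0, 0.25, 1, 1, 1, 1])
```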

You *can* create a spline with a given set of control points and knots directly using skSplineSegment() as follows:

```
// Not recommended, not a public API
skSplineSegment(sketch, "spline2", {});
var guess = {"spline2": [
    0,  // isClosed
    0,  // isRational
    3,  // degree
    5,  // numControlPoints
    0,0,12.5,0,-8.75,16.25,100,62.5,100,100,  // pts
    0,0,0,0,0.25,1,1,1,1,  // knots
    0,  // lowParameter
    1   // highParameter
]};
skSetInitialGuess(sketch, guess);
skSolve(sketch);
```

Note, however, that this is not a public API and is subject to change, so it’s better to use skFitSpline() where possible. If you want to define a spline using control points, you can easily convert it to skFitSpline() form by evaluating it at all of its knots, and calculating the startDerivative and endDerivative using equation set 1.

(For completeness, there are actually at least four internal functions that produce splines: skSpline(), skSplineSegment(), skInterpolatedSpline() and skInterpolatedSplineSegment(). The ‘Interpolated’ versions take a set of points the curve passes through plus the start and end derivatives, while the non-interpolated versions take control points. The ‘Segment’ functions produce an open spline with endpoints, whereas the non-segment functions can be used for closed splines without endpoints. skFitSpline() is a more user-friendly version of skInterpolatedSplineSegment().)

It can be seen that the B-spline form of a spline has a number of advantages over specifying polynomial coefficients for each polynomial piece. Firstly, the control points have a more intuitive geometric interpretation than the polynomial coefficients. Secondly, if **b**(t) is chosen to have second-derivative continuity, then **s**(t) automatically has second-derivative continuity without needing to fit a set of coefficients that satisfy this criterion. However there is nothing magic about B-splines: you can calculate the polynomial coefficients from the B-spline control points, and vice versa, so they are just different representations of the same curves.

Note also that for a single-piece spline with a simple knot vector like [0,0,0,0,1,1,1,1], **b**(t) is exactly equivalent to the weighting that defines a Bézier curve, and the resulting spline is the same as the Bézier curve with the same control points. More complicated splines can be decomposed into multiple such Bézier pieces. So again there is nothing more or less powerful about these splines versus a set of Bézier curves joined together; however the B-spline representation requires fewer control points for the default case that second-derivative continuity is desired.

In fact I’ll admit to a little sleight-of-hand when preparing this article, where I’ve used the properties of B-splines to make my life easier. In one of the sections above, I split the curve created by skFitSpline() into its two equivalent pieces. The first time I did this, I did it from first principles — writing down the curves in polynomial form and setting the second derivatives equal at the joining point — and it took me a few pages of working. When I needed to redo it for the example spline in this article, though, I used an easier method which is to split the B-spline using a knot insertion algorithm. This is a method for inserting extra knots (and calculating updated control points) while preserving the same curve. By inserting two extra knots at t=0.25, the control polygon can be made to pass through the point (10,10) and we now have two explicit Bézier pieces. Here is the process I used in Octave:

```
octave:10> cpts, knots
cpts =
     0.00000 +   0.00000i
    12.50000 +   0.00000i
    -8.75000 +  16.25000i
   100.00000 +  62.50000i
   100.00000 + 100.00000i
knots =
   0.00000   0.00000   0.00000   0.00000   0.25000   1.00000   1.00000   1.00000   1.00000
octave:11> [cpts,knots]=bsplineinsert(degree,cpts,knots,0.25)
cpts =
     0.00000 +   0.00000i
    12.50000 +   0.00000i
     7.18750 +   4.06250i
    18.43750 +  27.81250i
   100.00000 +  62.50000i
   100.00000 + 100.00000i
knots =
   0.00000   0.00000   0.00000   0.00000   0.25000   0.25000   1.00000   1.00000   1.00000   1.00000
octave:12> [cpts,knots]=bsplineinsert(degree,cpts,knots,0.25)
cpts =
     0.00000 +   0.00000i
    12.50000 +   0.00000i
     7.18750 +   4.06250i
    10.00000 +  10.00000i
    18.43750 +  27.81250i
   100.00000 +  62.50000i
   100.00000 + 100.00000i
knots =
   0.00000   0.00000   0.00000   0.00000   0.25000   0.25000   0.25000   1.00000   1.00000   1.00000   1.00000
octave:13> plot(bsplineeval(degree,cpts,knots,t),'-',cpts,'--.')
```

Now control points 1 to 4 and control points 4 to 7 define two Bézier curves, and we can compute the required startDerivative and endDerivative values:

```
octave:13> startDerivative1 = (cpts(2)-cpts(1))*3
startDerivative1 = 37.500
octave:14> endDerivative1 = (cpts(4)-cpts(3))*3
endDerivative1 = 8.4375 + 17.8125i
octave:15> startDerivative2 = (cpts(5)-cpts(4))*3
startDerivative2 = 25.312 + 53.438i
octave:16> endDerivative2 = (cpts(7)-cpts(6))*3
endDerivative2 = 0.00000 + 112.50000i
```

I’ve uploaded the code for bsplineinsert.m to my GitHub along with some other B-spline utility functions.
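For those following along in Python instead, Boehm’s knot-insertion step is only a few lines. This is my own minimal sketch, equivalent in spirit to bsplineinsert.m (complex-number control points, as in the Octave code):

```python
import bisect

def insert_knot(degree, cpts, knots, tbar):
    """Insert one knot at tbar (Boehm's algorithm), returning new control
    points and knots that describe the same curve."""
    p = degree
    k = bisect.bisect_right(knots, tbar) - 1  # span index: knots[k] <= tbar < knots[k+1]
    new_cpts = list(cpts[:k - p + 1])
    for i in range(k - p + 1, k + 1):
        # Each affected control point is a blend of two old neighbours
        alpha = (tbar - knots[i]) / (knots[i + p] - knots[i])
        new_cpts.append((1 - alpha) * cpts[i - 1] + alpha * cpts[i])
    new_cpts += list(cpts[k:])
    return new_cpts, knots[:k + 1] + [tbar] + knots[k + 1:]

cpts = [0 + 0j, 12.5 + 0j, -8.75 + 16.25j, 100 + 62.5j, 100 + 100j]
knots = [0, 0, 0, 0, 0.25, 1, 1, 1, 1]
cpts, knots = insert_knot(3, cpts, knots, 0.25)
cpts, knots = insert_knot(3, cpts, knots, 0.25)
# After two insertions the control polygon passes through (10, 10),
# splitting the curve into two explicit Bezier pieces as above.
```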

Hopefully in this instalment you’ve learnt a few things about splines and how they can be represented in Onshape… I know I have. In part 2, I discuss offset splines, derivatives of splines, creating splines directly in 3D, and some other tidbits about Onshape internals.

I recently purchased a pair of Dell Venue 11 Pro 7140 tablet computers — one for myself and one for my girlfriend. I figured this would be a good crossover device between a tablet and a laptop, and so far I’m not disappointed. One important reason I chose this model is because they are more easily serviceable than other comparable offerings like the Microsoft Surface Pro line; the Surface Pro is glued together in a way that is difficult to disassemble and reassemble, which makes me very unhappy. Batteries have a limited service life (as I know all too well from my last post), and I don’t want to throw away a perfectly good tablet in a couple of years just because the battery is not holding a charge. Even if you don’t have any interest in taking apart your own devices, I would encourage you to consider the fixability of your device and vote with your wallet against throw-away devices. Anyway, enough of that rant.

Together with my Dell tablet I bought a keyboard. There are two types of keyboard available for the Dell Venue 11 Pro: a “slim tablet” keyboard (K11A), which is just a keyboard and touchpad, and a “mobile” or “travel” keyboard (K12A), which additionally has a battery in it. I chose the latter version with the battery inside so that I could get more battery life out of my tablet.

I ran into an interesting problem with the travel keyboard, or at least my unit. Sometimes after attaching it to the tablet, the touchpad didn’t work. Sometimes the whole keyboard wouldn’t work. Strangely though, if I disconnected the battery inside the keyboard or let it drain completely, it would always work reliably, so I knew that it was some interaction with the voltage from the battery.

Most people at this point would call Dell, and unless you’re an electronics geek this is probably the best option. But I thought I would try to track it down.

The first thing I did after removing the cover was to probe the 6-pin touchpad connector, since that was the most obvious point of brokenness. The pinout for this connector looks approximately like this:

| Pin | Description |
|---|---|
| 1 | I/O (PS/2 clock?) |
| 2 | ground |
| 3 | I/O (PS/2 data?) |
| 4 | connected to 3.3V via 0 ohm jumper |
| 5 | I/O (touchpad button?) |
| 6 | 3.3V |
In normal working operation, five of the six pins — all except ground — probed as 3.3V, which is the normal idle state of the PS/2 bus. In the broken state, pin 3 was stuck low (0V). Essentially the touchpad PS/2 bus was in a locked up state. Note that the button may still work in this state (as I recall someone else on the Internet noting).

Most interesting was what happened when the keyboard was disconnected from the tablet, leaving it powered only by its internal battery. Now all the 3.3V pins dropped to 2V. This was likely the cause of the PS/2 bus lockups: the components running off that voltage supply (touchpad, keyboard, and associated microcontroller) were unlikely to work reliably at 2V. When the tablet was plugged back in, the voltage rose back to 3.3V, but the microcontroller and/or touchpad were sometimes in a bad state from their sojourn at 2V.

I initially thought that the 2V might be due to a “back powering” issue (“back powering” occurs when an unpowered IC has voltage applied to some I/O pins, and current flows from those I/O pins into the power rail). However, I couldn’t find any ICs that were connected in a way that would cause back-powering. Rather, it seemed that there was some issue with the circuit that turns the power on and off.

The circuit that controls power to the keyboard/touchpad looks like this:

This type of circuit is what’s known as a “high-side switch”. The element doing the actual switching is Q2, a P-channel MOSFET transistor, and Q1 is an N-channel MOSFET acting (together with R2) as an inverter.

When the keyboard is attached to the tablet, ENABLE (USB_5V) is 5V → transistor Q1 is on → node X is pulled to ground → Q2 turns on fully → OUT is 3.3V. **This was working as expected in my keyboard.**

When the keyboard is disconnected, USB_5V is floating and pulled down to ground by R1 → Q1 is off → node X should be pulled up to 7V by R2 → Q2 should turn off fully → OUT should float to 0V assuming nothing else is powering it. **This was not working correctly in my unit. Instead, node X was at 2.7V and OUT at 2.0V.**

(N.B. the “7V” I refer to may not be exactly 7V; it’s the battery voltage minus a bit so may be higher or lower depending on how much charge is in the battery.)
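The intended behaviour is easy to summarize as a toy model (idealized on/off MOSFETs with arbitrary thresholds; this just captures the logic above, it is not a circuit simulation):

```python
def high_side_switch(attached, vbat=7.0, vrail=3.3):
    """Toy model of the keyboard's high-side switch: USB_5V drives the
    inverter Q1, whose output (node X) gates the P-channel switch Q2."""
    usb_5v = 5.0 if attached else 0.0   # R1 pulls the enable line low when detached
    q1_on = usb_5v > 1.0                # N-channel Q1 conducts when its gate is high
    node_x = 0.0 if q1_on else vbat     # R2 pulls node X up to ~battery voltage
    q2_on = (vrail - node_x) > 1.0      # P-channel Q2 needs its gate well below its source
    return vrail if q2_on else 0.0      # OUT: 3.3V when attached, 0V when detached

# In my unit, the detached case measured ~2V instead of 0V.
```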

My first thought was that the 100K pull-up resistor might be too high, and the leakage current through the three transistors was dragging down node X from 7V to 2.7V (it only takes 40 μA to drop 4V across 100K, and in my world 40 μA is tiny).

I removed the pull-up R2 from the board and replaced it with a 65K resistor instead, thinking that would solve the problem. To my great surprise node X remained around 2.7V.

I then tried temporarily connecting node X to 3.3V with a 10K resistor to make the pull-up even stronger, and it still barely moved above 2.7V.

The value 2.7V is about a diode drop below 3.3V which I thought might be a clue. To determine whether this was relevant, I tried an experiment with the 3.3V rail off, connecting node X to 2V from a current-limited power supply. Now both the 3.3V rail and the OUT rail were hovering at 1.2V, and node X was sinking about 1.2mA. Where was this current going? Q1 should have been off, and Q2 and Q3 are oriented with their gates connected to node X so should not be sinking any current from it (the gate of a MOSFET is insulated from the other terminals so the gate current should be essentially zero).

I considered three options:

- (a) Q2 was not a P-channel MOSFET but something else. (I couldn’t determine the part number for any of Q1/Q2/Q3; part markings on these tiny SMD parts often have no correlation with the part numbers, so working backwards from markings is difficult unless you happen to stumble on the right datasheet.)
- (b) There was something else connected to node X that I hadn’t found in my tracing of the board.
- (c) One of the transistors was broken and leaking current.

Testing components in-situ is tricky as multimeter readings are affected by anything else connected to the nodes, for example node X has a capacitor that will sink current until charged up. After a day of head-scratching and verifying my schematic, I decided that (c) was most likely and I pulled Q2 off the circuit board with a dose of hot air.

It was immediately apparent that Q2 was broken as its gate-to-source and gate-to-drain resistances were only a few kiloohms. In a MOSFET, the gate is behind an insulator, so these resistances should be megaohms. **The low resistance from gate-to-source and gate-to-drain suggests that the gate oxide in this MOSFET is damaged.** (Interestingly, the resistance is different in the gate-to-source direction vs the source-to-gate direction, and similarly for gate-to-drain vs drain-to-gate. This makes sense if you consider that a MOSFET, under the oxide, contains two p-n junctions like a BJT transistor.)

After replacing Q2 with a new P-channel MOSFET, the output voltage now turns off fully, and my keyboard and touchpad work normally.

One thing I learnt from this debugging process is that MOSFETs can fail in interesting ways: not just ‘on’ or ‘off’, but with current leaks that behave partly like resistors and partly like p-n junctions, reflecting the internal construction of the device.

As to why this device failed, I’m still not sure. My best guess from the marking (N1D3C) is that it’s an Si2301CDS which, if correct, seems perfectly up to the task. Most MOSFET power switches fail due to thermal stresses, but the current delivered by this circuit is very low compared to what this transistor can handle. I also doubt that there are any transient voltage spikes as both input and output have capacitors. My only thought is that the pull-up to around 7V (V_{GS(off)} around +4V) may contribute to stress on the gate oxide over time, compared to if it was driven more conservatively with 3.3V (V_{GS(off)} = 0V). The purpose of driving the gate to +4V when off is likely to reduce off-state power consumption, by “over pinching” the channel to reduce leakage. However it may also increase the likelihood of charge carriers migrating into the gate oxide. The datasheet specifies V_{GS} = +8V as the “absolute maximum” beyond which permanent damage may occur, however there will also be a threshold beyond which device reliability is affected, which is unfortunately not stated in the datasheet. It may be that keeping the device for extended periods at V_{GS} around +4V — as will be the case whenever the keyboard is off — is not a good idea. It’s clear that the designer wasn’t sure what the right answer was, either, since they’ve included pads for pulling up to both 3.3V and to 7V. Presumably a lot of people have this keyboard and it works just fine, so there is some element of silicon defects or bad luck as well.

Anecdotally it seems that a few other people on the Internet have had similar issues to mine, although I’m not sure whether they all stem from the same root cause. If you have this exact same behaviour (your Venue 11 Pro touchpad doesn’t work when the battery is charged but works reliably when battery is flat), and if you’re an electronics geek and you want to open your keyboard, I’d love to know whether your voltages at X and OUT have a similar anomaly to mine. Of course, if you can get it fixed under warranty then you should probably do that before risking voiding your warranty.

If you are just joining this story you may want to start at part 1.

In part 2, we discovered that an embedded controller update is performed by uploading a small ‘flasher’ program to the EC. This flasher program is then responsible for programming a new firmware image to the EC’s internal flash memory. However, both the flasher program and part of the firmware image are encrypted: the old (currently running) EC firmware decrypts the flasher program, and the flasher program then decrypts the new firmware update. This creates a bit of a chicken-and-egg problem that prevents discovering the encryption algorithm from firmware update files alone.

We managed, however, to find a decrypted version of the EC firmware online, dumped directly from EC flash memory via JTAG. Let’s dive right in and disassemble the decryption function that can be found in that flash image. The core of it looks like this:

```
2854: 30 25 8d 1f 00 00 48 14    ld     r13,[r13,0x1448]
285c: 30 21 81 0f 00 00 48 10    ld     r1,[r1,0x1048]
2864: cf 7e                      extb_s r14,r14
2866: 02 be                      asl_s  r14,r14,2
2868: b9 61                      add_s  r1,r1,r13
286a: 30 26 8d 1f 00 00 48 18    ld     r13,[r14,0x1848]
2872: 4f 7f                      extb_s r15,r2
2874: 02 bf                      asl_s  r15,r15,2
2876: a7 79                      xor_s  r1,r1,r13
2878: 30 27 8d 1f 00 00 48 1c    ld     r13,[r15,0x1c48]
2880: b9 61                      add_s  r1,r1,r13
```

Here each input byte is transformed through a lookup table (in cryptography terminology, a substitution box or ‘S-box’) and the results are combined with an add/xor/add structure. This is the Blowfish cipher, as becomes evident from one glance at the diagram in the Wikipedia article on Blowfish:

[Diagram by Decrypt3/DnetSvg via Wikimedia, CC BY-SA 3.0]

Now normally the first stage of Blowfish, like most ciphers, would be to expand a user-provided key – a password or passphrase or some other secret data – to produce a set of 18 round keys (called the ‘P array’ in Blowfish terminology) and the four 256-entry S boxes depicted above. In cryptography this expansion step is called a key schedule.

In the case of the Lenovo firmware, it turns out that the keys are stored in pre-expanded form, i.e. the values of the P array and S boxes are stored in flash memory rather than the original key string. We can extract the P array and S boxes from the dumped flash image and use them for encryption/decryption.

(I do not believe there is any easy way – where in cryptography easy means ‘significantly better than trying all possibilities’ – to recover the original key string that was used to generate the P array and S boxes. This would be an interesting challenge, but is of purely academic interest: the P array and S boxes are all that is needed for both encryption and decryption. Each round of Blowfish involves taking the data, mixing in [exclusive or] one of the round keys from P and then performing the function depicted above using the S boxes; this is repeated 16 times; apart from some minor details you now understand how Blowfish works.)
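To make the round structure concrete, here is a minimal Python sketch of it (my own code, not the firmware’s; the function names are mine). Because the Feistel construction is invertible for *any* choice of P array and S boxes, decryption simply applies the round keys in reverse order:

```python
# Minimal sketch of the Blowfish round structure. P is the 18-entry round-key
# array, S the four 256-entry S-boxes.
MASK = 0xFFFFFFFF

def f(S, x):
    # The add/xor/add mixing seen in the disassembly: one S-box lookup per
    # input byte.
    a, b, c, d = (x >> 24) & 0xFF, (x >> 16) & 0xFF, (x >> 8) & 0xFF, x & 0xFF
    return ((((S[0][a] + S[1][b]) & MASK) ^ S[2][c]) + S[3][d]) & MASK

def encrypt_block(P, S, left, right):
    for i in range(16):
        left ^= P[i]
        right ^= f(S, left)
        left, right = right, left
    left, right = right, left        # undo the final swap
    right ^= P[16]
    left ^= P[17]
    return left, right

def decrypt_block(P, S, left, right):
    # Identical structure with the round keys applied in reverse order.
    for i in range(17, 1, -1):
        left ^= P[i]
        right ^= f(S, left)
        left, right = right, left
    left, right = right, left
    right ^= P[1]
    left ^= P[0]
    return left, right
```

(The real firmware presumably also chains 8-byte blocks in some mode such as CBC; that detail is omitted here.)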

Having the encryption algorithm and keys, we can now decrypt the flasher program that is uploaded to the embedded controller when performing a firmware update.

Analysis of the flasher program allows us to determine the checksum algorithm used to validate the firmware image. Even without a detailed analysis of the disassembly, we can notice that it uses the following table:

```c
uint16_t csum_table[256] = {
    0x0000, 0x1021, 0x2042, 0x3063, 0x4084, 0x50A5, 0x60C6, 0x70E7,
    0x8108, 0x9129, 0xA14A, 0xB16B, 0xC18C, 0xD1AD, 0xE1CE, 0xF1EF,
    0x1231, 0x0210, 0x3273, 0x2252, 0x52B5, 0x4294, 0x72F7, 0x62D6,
    0x9339, 0x8318, 0xB37B, 0xA35A, 0xD3BD, 0xC39C, 0xF3FF, 0xE3DE,
    ...
};
```

A quick Google for these numbers gives this secret away easily: this is a lookup table for a CRC-16 function with a generator polynomial of 0x1021. At this point we can be pretty sure that the algorithm is CRC-16, with only a few details to determine like initialisation value, but let me indulge myself and make a brief digression into what a CRC is and why the table looks like this. You can skip to the next section when this gets too dense for your liking.

The first thing to note is that when people refer to a CRC-16 generator polynomial of 0x1021, what they actually mean is 0x11021, i.e. binary 10001000000100001, i.e. x^16 + x^12 + x^5 + 1. (The x^n term of a CRC-n generator polynomial is always assumed to be 1, otherwise the following algorithm would not produce an n-bit CRC.)

Now the best way to imagine calculating a CRC is, for every 1 bit that is set in the input message, you XOR in the polynomial (towards the right, if working left to right). This clears that bit and flips some subsequent bits: for the above polynomial, it flips the three bits at N+4,N+11,N+16. If those bits end up as 1 (when you come to them), they will in turn propagate to other bits. Once you reach the end of the message, you’ve cleared all the bits in the message, but you have some bits hanging off the end: these are the CRC. You can intuitively see from this that the CRC has error propagating properties; an erroneous bit will likely cascade to many other bits. (Engineers might be reminded of a linear feedback shift register which has many similarities. The CRC can also be expressed mathematically as the remainder of a long division in carry-less binary arithmetic [GF(2)], the similarity to long division will become apparent in the examples below.)

As an example, if the input message to our CRC is one byte with value 5 (00000101), then the CRC calculation would proceed like this, with the polynomial XORed in at the position of the leading 1 at each step:

```
(5)  00000101|
xor       100|01000000100001        (0x11021)
     ------------------------
     00000001|01000000100001
xor         1|0001000000100001      (0x11021)
     --------------------------
     00000000|0101000010100101  (0x50a5)
```

Here is another example for an input byte with value 16 (00010000):

```
(16) 00010000|
xor     10001|000000100001          (0x11021)
     ----------------------
     00000001|000000100001
xor         1|0001000000100001      (0x11021)
     --------------------------
     00000000|0001001000110001  (0x1231)
```
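The by-hand procedure above can be written directly as code. This is a sketch of mine (not the flasher’s implementation): append 16 zero bits to the message, then XOR the 17-bit polynomial in at every remaining 1 bit, working left to right.

```python
def crc16_bitwise(data, poly=0x11021):
    # Message followed by 16 zero bits; what remains past the end is the CRC.
    bits = int.from_bytes(data, 'big') << 16
    for pos in range(8 * len(data) - 1, -1, -1):
        if bits & (1 << (pos + 16)):
            # Align the polynomial's leading 1 with this message bit.
            bits ^= poly << pos
    return bits
```

Running it on the two worked examples reproduces the table entries: `crc16_bitwise(b'\x05')` gives 0x50A5 and `crc16_bitwise(b'\x10')` gives 0x1231.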

For larger messages doing this bit by bit would be slow, of course. To do it more efficiently we can pre-calculate the effect of a byte on the next sixteen bits (this will be some pattern of bit flips), for each of the 256 possible values of that byte. We then look up the first byte, XOR that in to the next sixteen bits, move to the next byte and repeat. Actually the “effect of the first byte on the next sixteen bits” is exactly equivalent to the CRC of that byte – the CRC is the pattern of bit flips that ends up past the end of a message – so the lookup table is in effect a table of CRCs of single bytes.

Looking at the above examples you can verify that the CRC of 5 is indeed the entry with index 5 in the table above, and the CRC of 16 is indeed the entry with index 16. If you stare at the first example for a moment, you might see why the first sixteen entries of the table are multiples of 0x1021: all the 1 bits in 0x11021 are far enough apart that the input bits propagate independently into the CRC. Things change at entry 16 because the next 1 in the polynomial crosses over into the message.
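That table-driven approach looks like this in Python (a sketch of mine; the flasher’s exact initialisation value still needs to be confirmed against the disassembly):

```python
def make_crc16_table(poly=0x11021):
    # Entry i is the CRC-16 of the single byte i: the pattern of bit flips
    # that byte pushes past the end of the message.
    table = []
    for byte in range(256):
        crc = byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) if crc & 0x8000 else (crc << 1)
        table.append(crc & 0xFFFF)
    return table

def crc16(data, table, init=0x0000):
    # Table-driven version: fold in one byte at a time.
    crc = init
    for byte in data:
        crc = ((crc << 8) & 0xFFFF) ^ table[(crc >> 8) ^ byte]
    return crc
```

The generated table matches the `csum_table` values found in the flasher, e.g. entry 5 is 0x50A5 and entry 16 is 0x1231.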

Returning to the firmware, we can verify that the flasher checksum is in fact CRC-16. It’s important to note that the flasher program calculates this checksum *after* decrypting the firmware image, so the CRC-16 must be calculated on the *decrypted* version of the image. (It turns out that the firmware image is encrypted using the same key as the flasher program, so at this point we already have everything we need to know.)

I mentioned in part 2 that a simple 16-bit checksum is also present at the very end of the firmware image. In fact, there is also a third type of checksum that we discover when we disassemble the EC boot code: four 32-bit checksums, each performed on a different area of the flash memory.

Summarising now, the three EC checksums that must be correct are:

- The **outer checksum**: BIOS checks the checksum of the (encrypted) image before uploading it to the flasher. The last two bytes of the image are calculated such that the total sum of the 16-bit words of the image is zero. If this fails, flashing doesn’t proceed, so this is the least dangerous checksum to fail.
- The **flasher checksum**: The flasher calculates the CRC-16 checksum of all the *decrypted* bytes except the last four bytes. The penultimate two bytes of the image contain this checksum. If this checksum fails, the flasher sets a flag in flash that causes the EC firmware to fail to boot.
- The **boot checksums**: The EC firmware, at boot, calculates 32-bit checksums of four different areas of the flash. If any of these fail, the EC firmware fails to boot. These checksums must, of course, also be calculated on the *decrypted* image.

We now have everything we need to know to successfully modify the embedded controller firmware. Returning to my original goal, here is the change I made to the battery authentication check (before/after, with some comments added):

```diff
  ; call validation function on battery response
  1d168: ee 0f af f2    bl.d  0x2954        ; validate
  1d16c: f8 16 02 11    ldw   r2,[r14,248]

- ; branch to failure path if return value not equal to 1
- 1d170: 0b 08 51 00    brne  r0,1,0x1d17a
+ ; branch replaced with no-operation instruction
+ 1d170: 4a 26 00 70    nop

  ; success path - set bit 3 in battery status
  1d174: 00 86          ld_s   r0,[r14,0]
  1d176: 83 b8          bset_s r0,r0,3
  1d178: 24 f0          b_s    0x1d1c0

  ; failure path - try a different validation function,
  ; else retry up to 8 times, else abort
  1d17a: 10 10 80 20    ldb   r0,[r16,16]
  ...
```

Previously, if the return value from the validation function was not equal to 1, the code would branch to a failure path. I replaced this conditional branch instruction with a no-operation instruction so that, regardless of the validation result, execution would fall through to the success path. (This technique of replacing jumps with nops is a common trick when you need to modify the behaviour of binary code.)
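In practice I would apply such a change with a small script rather than a hex editor. Here is a hedged sketch (the file offset is illustrative only; the patch location within the firmware file must be worked out from the disassembly’s load addresses):

```python
def patch(image, offset, expected, replacement):
    # Replace `expected` bytes at `offset` with `replacement` of the same
    # length, refusing to touch anything that doesn't match - a cheap guard
    # against patching the wrong firmware version.
    assert len(expected) == len(replacement)
    found = image[offset:offset + len(expected)]
    if found != expected:
        raise ValueError(f'expected {expected.hex()} at {offset:#x}, found {found.hex()}')
    image[offset:offset + len(replacement)] = replacement

# The brne -> nop change from the listing above, applied to a dummy image
# (the offset within the real file is hypothetical here).
fw = bytearray(b'\x00' * 0x20000)
fw[0x1d170:0x1d174] = bytes.fromhex('0b085100')   # original brne r0,1,0x1d17a
patch(fw, 0x1d170, bytes.fromhex('0b085100'), bytes.fromhex('4a260070'))  # nop
```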

(At this point my focus was on the first authentication sequence – state 12 in the firmware state machine that I listed in part 2.)

Also, for fun and so I could track my modified firmware version, I changed the embedded firmware version from “GCHT25WW” to “GCHT25MC”. I thought putting my initials in the geographic region field would be appropriate: the original suffix, WW, presumably means worldwide, while this firmware had… somewhat more limited geographic reach.

Having made my changes, I wrote some small utilities to regenerate the three types of checksums – I have published these utilities on GitHub – and finally I re-inserted the modified embedded controller firmware into the BIOS update .FL2 file at offset 0x500000.

If you are attempting this, make sure you check and double-check that you’ve modified exactly what you intended to modify. For Linux users, process substitution can be handy here, e.g. to get a binary diff of two files you can do: diff -u <(xxd file1.bin) <(xxd file2.bin).

I ran the BIOS update program and was greeted with:

An update is not necessary at this time. The process has been canceled.

Thwarted. I was already running the latest BIOS version and the update program did not offer an option to force an update of the BIOS or EC. Downgrading and then upgrading again would usually be a workaround, but I wasn’t sure that this would be possible, as the Lenovo release notes mention that, “if the UEFI BIOS has been updated to version 2.63 or higher, it is no longer able to roll back to the version before 2.63 for security improvement”.

Fortunately, it turns out it’s possible to run the update program manually, bypassing the limited user interface.

*A note: As I do not have Windows on this laptop, I am running the BIOS update program from a USB flash drive following some instructions I found online (in short: download the bootable CD image; run geteltorito -o bios.img gcuj23us.iso; write the bios.img file to the USB stick). I suspect the below process is even easier from Windows, where you can directly run winflash32.exe or winflash64.exe in place of dosflash.exe.*

The Lenovo BIOS update CD is just a DOS boot disk at heart. If you’ve written it to rewritable media like a USB flash drive, you can edit autoexec.bat to stop it starting the user interface, in which case it will drop you to a DOS prompt.

Updating the EC firmware can be performed using dosflash.exe as follows (`/sd` skips the date check, to allow downgrading; `/ipf ec` targets the EC area):

```
C:\FLASH> dosflash /sd /ipf ec /file GCETA3WW\$01DA000.FL2

SCT Flash Utility for Lenovo
for Shell V1.0.1.3
Copyright (c) 2011-2012 Phoenix Technologies Ltd.
Copyright (C) 2011-2012 Lenovo Group Limited.

Read BIOS image from file.
Initialize Flash module.
Read current BIOS.
Oem check
Prepare to flash "ec"
Do not turn off computer during the update!!!
Begin Flashing......
Total blocks of the image = 48.
|---+----+----+----+----+----+----+----+----+----|
..................................................
Image flashing done.
Flashing finished.
BIOS is updated successfully.
WARNING: System will shutdown or reboot in 5 seconds!
```

A note here on the FL1 and FL2 files since I couldn’t find any explanation about this on the Internet: the FL1 file is a UEFI capsule file that contains the BIOS. The FL2 file contains embedded controller firmware at offset 0x500000-0x530000, the rest of it can be ignored. Why then is the FL2 file so large and why does it contain bits of an old BIOS version pasted after the EC firmware? I think partly it may be to appease the non-Lenovo-specific parts of dosflash.exe. I noticed that even though ultimately it only uses the 48 4KB blocks from 0x500000-0x530000, if I pass a file that ends there, dosflash.exe does not recognise it as a valid BIOS update file.
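Based on the offsets above, carving the EC region out of an FL2 file and splicing a modified image back in can be sketched as follows (the constants come from the layout just described; the helper names are mine):

```python
EC_OFFSET = 0x500000
EC_SIZE = 0x30000       # 192 KB = 48 blocks of 4 KB

def extract_ec(fl2):
    # Carve the EC firmware region out of the FL2 file contents.
    return fl2[EC_OFFSET:EC_OFFSET + EC_SIZE]

def insert_ec(fl2, ec_image):
    # Splice a (modified) EC image back in, leaving the rest untouched.
    assert len(ec_image) == EC_SIZE
    return fl2[:EC_OFFSET] + ec_image + fl2[EC_OFFSET + EC_SIZE:]
```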

(While the command shown above updates the EC, I will note here that it is also possible to update the *BIOS* in a similar way, by omitting /ipf ec and by specifying the FL1 file instead of the FL2 file: dosflash /sd /file GCETA3WW\$01DA000.FL1

Of course I recommend using the normal manufacturer-recommended BIOS upgrade/downgrade process when possible, but this may be useful if you are in a bind.)

Note that, despite what the output of dosflash.exe says, the actual EC firmware is not updated yet: at this point it has just written the update to an area where it can be picked up by BIOS at boot. Now after reboot the screen displays:

Flashing Embedded Controller…

Please do not power off!

(I appreciate the ‘please’, but really that should read “**DO NOT UNDER ANY CIRCUMSTANCES POWER OFF**, EVEN IF YOUR HOUSE IS ON FIRE AND SOMEONE HAS STOLEN YOUR PANTS”.)

A few skipped heartbeats later, the firmware update completes.

Sure enough, the system is now running my modified embedded controller firmware.

But my battery still isn’t charging. I hook up the SMBus signals to my logic analyser again and the communication looks like this:

```
...
START 16 (Control Byte: Slave Address B Write) 3C 04 NACK STOP
START 16 (Control Byte: Slave Address B Write) 3C 04 NACK STOP
START 16 (Control Byte: Slave Address B Write) 3C 04 NACK STOP
START 16 (Control Byte: Slave Address B Write) 3C 04 NACK STOP
```

It turns out that it isn’t even getting to the authentication check that I had modified, because the earlier command that sends the challenge to the battery is failing: as soon as the laptop sends command 3C and the data length indication 04, the battery is signalling NACK – not acknowledged – go away. So now I modify the state machine so that it proceeds whether or not that write command succeeds (again I do this by replacing a jump by a nop, this time in state 8).

Revision two deployed. Now the system boots with:

The battery installed is not supported by this system and will not charge. Please replace the battery with the correct Lenovo battery for this system. Press the ESC key to continue.

Well, that’s some sort of progress: at least it is no longer displaying the original unauthorised battery message.

I look in BIOS to see where these messages are coming from. Both this message and the original unauthorised battery message are displayed by LenovoVideoInitDxe.efi: don’t ask me why this code is in this module rather than somewhere more relevant (may I suggest LenovoAnnoyingBatteryMessageDxe.efi?), but it might have been convenient to put it in the video initialisation module as the message is displayed when the screen is cleared post-POST.

Anyway, in LenovoVideoInitDxe.efi it reads the battery status from the EC (register 0x38, which we came across in part 2 when decoding the ACPI tables, as well as register 0xd1 which has some additional battery status flags). Depending on certain bits in those registers, it may print one or other message.

Tracing through the EC code to find where those bits are set proves difficult, so I again hook the battery up to my logic analyser. It becomes evident that the sticking point is now the second authentication sequence.

I now make the same changes to the second authentication sequence as I did for the first: I prevent the write command from failing even if the battery NACKs it (state 15), and force the check of the response to succeed (state 19). This is now four changes in total, for each of which I’ve replaced a jump with a nop.

After booting to this final EC firmware revision, my saga comes to an end, almost anticlimactically. My replacement battery works, and I’m getting a good few hours out of it (and no, it hasn’t burst into flames).

There is still one very curious open question that I haven’t managed to figure out. There are 10 random-looking bytes at offset 0x200 of the firmware image – the start of what looks like a firmware information block – which are different in every EC revision. So far I haven’t found anything that accesses those bytes, and indeed my EC update works fine even when I leave them unchanged. Probably this is a red herring, but what these bytes are for is still a mystery.

I have uploaded my utilities to GitHub so that it is possible to replicate my work and possibly make other changes to EC firmware. (Edit, Jan 2017: Hamish Coleman has built some additional utilities around my code that may also be useful, see hamishcoleman/thinkpad-ec. If you are only interested in the battery patch you will need to disable his keyboard patches.) The exact changes I made to my X230T firmware are also available as a bspatch file here. However a large disclaimer applies to all of this: **do not attempt EC firmware modification at home unless you understand what you are doing. If something goes wrong with an EC update, there is a high likelihood of bricking your laptop, the only recourse being connecting to the EC via JTAG. I will not be held responsible for this. You should also understand that poor quality lithium ion cells can cause fires, as has been seen in the recent spate of hoverboard fires. I will also not be held responsible for this.**

I have since also torn down my old Lenovo battery, and I plan to write another post soon with some information and photos. The option of replacing the cells in a genuine battery may be worth considering as an alternative to modifying the EC firmware, the advantage being that you can choose your own high quality Li-Ion cells versus whatever you might happen to get in a replacement battery. The disadvantage is that it inevitably results in some damage to the battery casing, and as I mentioned before the controller will remember data about the old cells which might affect function of the new cells (I will see what I can figure out on that front).

To be fair, buying a genuine Lenovo battery is probably the best option for most people, at least while Lenovo is still making replacement batteries for this model. Primarily this was an exercise in ‘because I can’: I and a great many readers have enjoyed this process of diving deep into the system architecture of a modern laptop, at a level that few people are normally exposed to, and removing an annoying limitation that I feel I should be entitled to remove from *my* laptop. I do not have anything against Lenovo specifically – they are certainly not the only vendor who implements vendor restrictions – so please be nice in your comments.

In part 1, we looked at the communication between a Lenovo Thinkpad X230T laptop and battery, and discovered that there is a challenge-response protocol used to authenticate ‘genuine’ Lenovo batteries. On the laptop side, this – and battery communication in general – is implemented in a component known as the embedded controller (EC).

The embedded controller is a low-power embedded microprocessor that could be considered part of the ‘autonomous nervous system’ of a laptop. The embedded controller helps to control background functions such as power management, temperature monitoring, fan speed control, etc., and may stay powered even when the system is switched off. For historical and pragmatic reasons the embedded controller usually also serves as the system keyboard controller.

The ACPI (Advanced Configuration and Power Interface) standard defines an interface that allows an operating system to interact with the embedded controller in order to monitor and configure its power management functions, however the implementation of the embedded controller is proprietary and completely up to the system vendor. It is not always even clear what chip is used for the embedded controller in any particular computer.

Searching online, a number of other people have been interested in modifying ThinkPad embedded controller firmware to alter keyboard and fan behaviour. In earlier ThinkPad laptops which use Renesas H8S microcontrollers, the firmware has been analysed in detail and even modified successfully.

After downloading and poring over the BIOS update image for my X230T, I found something that looked like EC firmware at offset 0x500000 in the file $01DA000.FL2: the data that followed was 192KB in size and contained the string “GCHT25WW” which is the embedded controller firmware version. I tried disassembling this firmware image as Renesas H8S instructions but with no success. Either I was looking in the wrong place or Lenovo was no longer using the H8S.

After some more Internet searching I found a link to purported “Dasher 2” (X230) schematics (this is not exactly the same as my laptop, which is an X230T, but is very similar). These schematics show the embedded controller to be a Microchip MEC1619. The MEC1619 contains an ARC625D microprocessor and 192KB of flash memory, which indeed matches the 192KB size of the suspected firmware image. I now tried disassembling the firmware image according to the ARC625D instruction set (ARCompact) and sure enough: these were definitely ARC instructions, and the embedded controller in my laptop was almost certainly a MEC1619.

I could not find the full datasheet of the MEC1619 online, however apart from the ARC processor core it is similar to the MEC1322 (which uses an ARM core) and the MEC140x/MEC141x (which use a MIPS core). I found it useful to refer to those datasheets to understand the general architecture and peripherals available. (There even seem to be similarities in the memory maps: for example, the ACPI EC Interface is located at 0xF0D00 on the MEC140x and at 0xFF0D00 on the MEC1619.)

After successfully disassembling the EC firmware as ARC instructions, I searched through the disassembly for the command sent in the first authentication sequence (0x3c hexadecimal, or 60 in decimal). The value 60 occurred several times but I found one very good candidate:

```
1d0e8: c9 70    mov_s r0,r14
1d0ea: 0a d9    mov_s r1,10
1d0ec: 3c da    mov_s r2,60
1d0ee: 04 db    mov_s r3,4
1d0f0: 18 f1    b_s   0x1cf20

1cf20: 0d ff    bl_s  0x1cb54
```

Here r2 is set to 60 (0x3c) and r3 is set to 4, which is indeed the length of the data sent with the first authentication command. Searching for other branches to the same location, nearby I also found:

```
1d25e: c9 70          mov_s r0,r14
1d260: 0a d9          mov_s r1,10
1d262: 27 da          mov_s r2,39
1d264: bd 04 ef ff    b.d   0x1cf20
1d268: 11 db          mov_s r3,17
```

Here r2 is set to 39 (0x27 in hexadecimal – the second authentication command) and r3 is set to 17, which is indeed the length of the data sent with the second authentication command.

(If you’re wondering about this last code sequence and the last move instruction being *after* the branch, the ARC architecture, like MIPS and others, has branch delay slots: the b.d instruction executes the instruction following the branch while the branch is being taken, whereas a regular b or b_s instruction nulls it out. If this makes no sense to you, don’t worry about it. The _s after instruction names refers to the ‘short’ 16-bit encoding of the instruction and has no effect on the function of the instruction.)

So now I had a good lead for the part of the EC code that was querying the battery. I further analysed the code around that point, and this function indeed turned out to be a state machine that queried the battery. The relevant states look approximately like this:

state 7: start write command 0x3c (with 4-byte challenge)

state 8: check success, retry if necessary

state 9: unknown

state 10: start read command 0x3c

state 11: check success, retry if necessary

state 12: validate battery response

state 13: choose challenge number

state 14: start write command 0x27 (with 17-byte challenge)

state 15: check success, retry if necessary

state 16: unknown (same as state 9)

state 17: start read command 0x28

state 18: check success, retry if necessary

state 19: validate battery response

state 20: set battery status

At this point, I knew roughly what I had to patch (states 12 and 19 in particular). So far so easy, but this was only the start. The difficult part is that the firmware has checksums to guarantee integrity that I would have to update after making any changes.

I extracted the embedded firmware images from the 20 past BIOS versions for the X230T (which contained 8 different embedded controller firmware versions in total) and compared them to find areas that might be checksums. The last four bytes of the EC firmware image clearly appeared to be a checksum, and there were some other locations that consistently varied as well. I guessed (correctly) that if I programmed an image with the wrong checksums the EC would fail to boot and I would have a brick on my hands, so trial and error was not a very good option.
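The comparison itself is simple; this is a sketch (my code) of the kind of diffing I mean, returning the byte offsets that differ between versions, i.e. candidate locations for checksums and version strings:

```python
def varying_offsets(images):
    # Given several versions of the same firmware image, return the byte
    # offsets that differ between at least two versions.
    length = len(images[0])
    assert all(len(img) == length for img in images)
    return [i for i in range(length)
            if any(img[i] != images[0][i] for img in images[1:])]
```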

For the last four bytes, I experimented with various 32-bit checksum algorithms (summing up the words, exclusive or’ing the words, CRC32, Adler32, etc.) without success. I looked at the earlier work mentioned above that had successfully analysed the Renesas H8S firmware, and that firmware used simple 16-bit checksums. In fact, a 16-bit checksum did work: the last two bytes of the image were chosen so as to make a simple 16-bit sum of the whole image add to zero. But I couldn’t figure out how to calculate the other two bytes of the last four bytes. All four bytes changed together in firmware updates, so there was likely still some secret to be discovered.
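The 16-bit sum part can be regenerated with a few lines of Python. A sketch (I am assuming little-endian 16-bit words here, which matches the x86 side of the system; the exact byte order should be verified against a known-good image):

```python
import struct

def fix_sum_checksum(image):
    # Recompute the last two bytes so that the 16-bit words of the whole
    # image sum to zero mod 2**16. Little-endian word order is an assumption.
    body = image[:-2]
    total = sum(struct.unpack('<%dH' % (len(body) // 2), body)) & 0xFFFF
    return body + struct.pack('<H', (-total) & 0xFFFF)
```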

There was something else interesting: it appeared that parts of the EC firmware image were encrypted. After receiving the authentication response from the battery, the routine called a validation function, but that function (and surrounding code) looked like garbage when disassembled:

```
1d168: ee 0f af f2    bl.d 0x2954    ; validate

2954: 57 8b           ldb_s r2,[r3,23]
2956: 9f a1           st_s  r12,[r1,124]
2958: 5c 6a           asr_s r2,r2,4
295a: 54 2b 5f ed     vadd2h.f blink,r51,53
295e: 73 2f 40 f7     qmpywhu.f r0,pcl,29
2962: e9 3d a4 eb     ???.n.f r53,r53,46
2966: 26 04 86 d1     bnc 0xfffa5d8a
```

(This disassembly makes no sense, it cannot possibly be what the processor executes.)

Passing that part of the firmware image through the Linux ‘ent’ utility, the entropy was over 7.9 bits/byte compared to the rest of the image which had an entropy of about 5.9 bits/byte, making it very likely that encryption was involved (high entropy could also indicate compression, but that wouldn’t make sense here as there is a branch directly into the middle of it). Also, from a certain address onwards, the rest of the encrypted data changed completely in every firmware revision, suggesting cipher block chaining with a most likely block size of 64 bits.
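The ‘ent’ measurement is easy to reproduce yourself; this is a sketch (my code) of the bits-per-byte Shannon entropy calculation over a region of an image:

```python
import math
from collections import Counter

def entropy_bits_per_byte(data):
    # Shannon entropy of the byte-value distribution, in bits per byte:
    # close to 8.0 for encrypted/random data, lower for ordinary code.
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())
```

For example, a constant run of bytes scores 0, while a region containing every byte value equally often scores the maximum of 8 bits/byte.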

But the EC firmware executes directly out of flash memory, and the MEC1619 datasheet says nothing about the processor supporting decryption on-the-fly (which would be difficult to implement in any case). I assumed therefore that the EC code must be stored in flash memory in *decrypted* form, while the firmware update image is partially encrypted to protect it from prying eyes like mine. Additionally, if the second half of the checksum was calculated on the decrypted version rather than the encrypted version I had, this would explain why I was finding it hard to determine the checksum algorithm.

I started looking into how the process of updating the EC firmware works, in the hope that it would give me more insight into how the flash memory was accessed. In order to understand the firmware update process, it is useful to have some background knowledge about how the CPU and EC communicate.

At a physical level, the embedded controller is connected to the LPC bus. The LPC bus is like the cassette tape deck of the modern laptop: it is the last remaining remnant of the legacy ISA bus that preceded PCI that preceded PCI-Express. The LPC bus is still used for a number of miscellaneous devices for which speed is not critical. The physical topology in the X230T looks like this:

The ACPI standard defines a standard access method for communicating with the embedded controller. This access method involves writing commands to one I/O port and reading/writing data from/to another (on x86, these are generally port 0x66 for commands and port 0x62 for data). There are two main commands defined: Read Embedded Controller (0x80) and Write Embedded Controller (0x81). These commands allow reading or writing one of 256 locations in the embedded controller, each 8 bits in size. (For completeness, I will mention that there are a few other minor commands defined in the ACPI specification, but those are not implemented by the Thinkpad EC.)
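The transaction itself is a simple polled handshake: wait until the EC can accept a byte, write the command, write the register address, then wait for the result byte. Below is a Python sketch of the read transaction, exercised against a toy simulated EC rather than real hardware (the `FakeEC` model and all the names here are hypothetical; on real hardware the accesses would be `inb`/`outb` on ports 0x66 and 0x62):

```python
# ACPI EC status bits (read from the command/status port, 0x66):
OBF = 0x01  # output buffer full: a data byte is ready to read from 0x62
IBF = 0x02  # input buffer full: the EC has not yet consumed our last write

RD_EC = 0x80  # "Read Embedded Controller" command

def ec_read(port, addr):
    """One ACPI EC read transaction. `port` is any object providing
    read_status()/write_cmd()/write_data()/read_data()."""
    while port.read_status() & IBF:       # wait until EC can accept a byte
        pass
    port.write_cmd(RD_EC)
    while port.read_status() & IBF:       # wait until command is consumed
        pass
    port.write_data(addr)
    while not port.read_status() & OBF:   # wait for the result byte
        pass
    return port.read_data()

class FakeEC:
    """A toy simulated EC, just enough to exercise ec_read()."""
    def __init__(self, regs):
        self.regs = regs    # the 256-byte register space (here a dict)
        self.cmd = None     # last command written
        self.out = None     # pending output byte, if any

    def read_status(self):
        # IBF is always clear in this model; OBF reflects pending output.
        return OBF if self.out is not None else 0

    def write_cmd(self, cmd):
        self.cmd = cmd

    def write_data(self, value):
        if self.cmd == RD_EC:   # value is the register address to read
            self.out = self.regs[value]

    def read_data(self):
        value, self.out = self.out, None
        return value
```

For example, `ec_read(FakeEC({0x38: 0xa4}), 0x38)` performs the same Battery 0 Status read that the ACPI tables shown later drive through ports 0x66 and 0x62.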

It is possible to read and write these EC locations in Linux using the ec_sys kernel module and the ec_access utility:

```
$ sudo modprobe ec_sys
$ sudo ./ec_access -r
    00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00: a7 09 a0 c2 00 86 05 00 00 00 47 00 00 03 80 02
10: 00 00 ff ff d0 fc 00 01 7b ff 01 f0 ff ff 0d 00
20: 00 00 00 00 00 00 00 b8 00 00 00 00 e8 00 00 80
30: 00 00 00 00 31 04 00 00 a4 00 20 10 00 50 00 00
40: 00 00 00 00 00 00 14 c6 02 84 00 00 00 00 00 00
50: 00 80 02 0c 00 01 02 03 04 05 06 07 f5 1b 76 1c
60: 6e 95 f9 57 05 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 08 00 00 00 2d 00 00 00 00 00 00 00
80: 00 00 05 06 00 00 03 00 00 00 00 00 00 00 2b 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
A0: c0 00 c0 00 ff ff 64 00 00 00 3e 31 ff ff a0 02
B0: 00 00 00 00 00 00 00 00 00 00 41 05 01 18 01 00
C0: 00 00 00 00 00 00 00 00 01 41 00 00 00 9b 00 00
D0: 15 c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00
E0: 00 00 00 00 00 00 00 00 11 40 da 16 e4 2e 44 03
F0: 47 43 48 54 32 35 57 57 1c 67 63 0d 00 00 00 00
```

What are these 256 bytes? Well, in actual fact, this is entirely implementation-specific. ACPI *only* defines the access method; it does not define any of the 256 locations that might be presented by the EC. You can see the embedded controller version (“GCHT25WW”) at offsets f0-f7, but other than that this data is pretty opaque without further information.
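Given such a dump, pulling out the version string is trivial. A Python sketch (the function name is mine; the 0xf0 offset is specific to this Thinkpad EC, not defined by ACPI):

```python
def ec_version(ec_dump: bytes) -> str:
    """Extract the 8-character EC firmware version stored at
    offsets 0xf0-0xf7 of the 256-byte EC register space."""
    return ec_dump[0xF0:0xF8].decode("ascii")

# On a live system (as root, with ec_sys loaded), the same 256 bytes
# can be read from debugfs instead of via ec_access:
#   ec_dump = open("/sys/kernel/debug/ec/ec0/io", "rb").read()
```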

In order to work out what registers to access to perform a desired function – for instance, querying the system battery – the operating system uses ACPI *tables*. The ACPI tables are defined by the system manufacturer, in this case Lenovo, and the BIOS passes these tables to the operating system at boot. We can dump and decode the ACPI tables from a live system using the following commands (make sure you do this in a new directory as it will dump out a lot of files):

```
$ sudo acpidump -b
$ iasl -d *.dat
```

To query a system battery, the ACPI specification defines that the operating system should invoke a method called _BST (short for Battery Status). Below I’ve included an extract from my Thinkpad X230T’s ACPI tables which shows how _BST is implemented on the Thinkpad. (I have omitted much of it for readability; if you are interested in more detail [God help you], I encourage you to dump your own system’s tables.)

```
Device (EC)
{
    ...
    // This part defines how to access the EC
    Name (_CRS, ResourceTemplate ()             // Current Resource Settings
    {
        IO (Decode16, 0x0062, 0x0062, 0x01, 0x01)   // port 0x62
        IO (Decode16, 0x0066, 0x0066, 0x01, 0x01)   // port 0x66
    })
    ...
    // This part defines symbolic names for the EC registers
    Field (ECOR, ByteAcc, NoLock, Preserve)
    {
        ...
        Offset (0x38),
        HB0S, 7,    // Battery 0 Status @ EC 0x38 bits 0..6
        HB0A, 1,    // Battery 0 Active @ EC 0x38 bit 7
        HB1S, 7,    // Battery 1 Status @ EC 0x39 bits 0..6
        HB1A, 1,    // Battery 1 Active @ EC 0x39 bit 7
        ...
        HIID, 8,    // Battery Page Select @ EC 0x81
        ...
        SBVO, 16,   // Battery Voltage @ EC 0xAA-0xAB
        ...
    }

    // Function containing common code for both batteries
    Method (GBST, 4, NotSerialized)
    {
        ...
        HIID = Arg0     // Write Battery Page Select
        Local3 = SBVO   // Read Battery Voltage
        // More code here, eventually returning status
        ...
        Return ...
    }

    Device (BAT0)   // Battery 0
    {
        ...
        Method (_BST, 0, NotSerialized)   // _BST: Battery Status
        {
            ...
            // Read Battery 0 Status (HB0S), call GBST function
            Return (GBST (0x00, HB0S, ...))
        }
    }

    Device (BAT1)   // Battery 1
    {
        ...
        Method (_BST, 0, NotSerialized)   // _BST: Battery Status
        {
            ...
            // Read Battery 1 Status (HB1S), call GBST function
            Return (GBST (0x10, HB1S, ...))
        }
    }
    ...
}
```

Ultimately, a BAT0._BST invocation will read and write EC registers such as Battery 0 Status (0x38) and Battery Voltage (0xAA and 0xAB), but note that these registers are implementation-specific; they are not defined in the ACPI standard.

(If you’re wondering, the language here is ACPI Source Language [ASL], which is compiled to ACPI Machine Language [AML] and executed at runtime by a bytecode interpreter in the operating system. If this seems very complicated, it is: in a past life, I worked with microkernels, and the size of the kernel jumped by an order of magnitude when we implemented ACPI. The ACPI abstraction is very powerful, however, freeing the operating system from having to know about platform implementation details.)

Now let us return to the process of updating firmware. There are two stages: first, the Lenovo update program is run (from Windows or DOS) and it writes the firmware update image to a special area, then the system reboots to BIOS and the BIOS applies the necessary updates (to BIOS and/or embedded controller firmware). So the actual embedded controller update is performed by BIOS.

The ThinkPad has a BIOS that is implemented according to the UEFI specification. A detailed description of UEFI would take a whole book; briefly, it defines a modular BIOS consisting of many separate executable modules. For example, there may be a module to initialize the video card or a module to initialize the keyboard. These modules are in Portable Executable format, which is the same format used for Windows executables, although clearly the environment is far more primitive. The UEFI loader – the core of the boot process – calculates the dependencies between the modules and then runs each module in the most sensible dependency order. Each module takes a pointer to an EFI system table which contains pointers to all the system services provided by the UEFI infrastructure.

There are a number of tools around that can pull apart a UEFI ‘capsule’ (firmware update image). I used UEFITool.

The core of the EC update process is implemented in a BIOS module called EcFwUpdateDxe.efi. I mentioned above that an ACPI-compatible embedded controller exports two commands, 0x80 (read) and 0x81 (write). It turns out that early in the boot process the Lenovo EC exports an additional command, 0xaf (upload code). (Later in the boot process this command is irreversibly disabled until the next reboot: this is in fact a good thing for security as silent updating of the EC by viruses could be very dangerous. A similar security mechanism also applies for updating the main BIOS image.)

EcFwUpdateDxe.efi, invoked early in the boot process, uses EC command 0xaf to upload a small ‘flasher’ program to the embedded controller. The purpose of the flasher program is to accept commands from the host to erase and program the internal flash memory of the EC. Importantly, the flasher program runs from EC RAM so that the internal flash memory can be reprogrammed (normally, the EC runs directly from internal flash memory, which would make it impossible for it to reprogram itself).

The update process sends the EC firmware image to the flasher program in the original partially-encrypted form. The flasher program then presumably decrypts blocks before it programs them.

So surely we can work out the decryption algorithm by disassembling the flasher program, right? Ah, but there’s a catch. The flasher program itself is encrypted! And it is decrypted by a function on the EC which is itself encrypted, at least in the firmware update image. So temporarily we are at an impasse.

But remember that I hypothesised that the EC firmware is stored in flash memory in decrypted form. If we could read that flash memory via some side channel, we would presumably be able to extract the decryption function. As it turns out, the MEC1619 has a JTAG interface that can be used for programming and readback of its flash memory, as well as remote debugging of the ARC625D microprocessor. JTAG is a standard that is widely used by hardware folks for testing and programming chips; it consists of four or five pins: TCK (clock), TMS (mode select), TDI (data in), TDO (data out), and sometimes TRST (reset). (Technically JTAG is the name of the industry association and the interface is called a Test Access Port, but everyone I know calls the interface JTAG.)

Physically getting to the MEC1619 to access these pins is rather difficult and takes a fair bit of disassembly (I believe the MEC1619 is located under the ExpressCard slot). Happily, I did not need to do this: in Googling for MEC1619 JTAG I found a Russian forum where some dude had already connected up JTAG to his laptop’s MEC1619 embedded controller, and what’s more, posted the flash memory image on the forum. Sure enough, it was in decrypted form.

Now that I had a decrypted version of the EC code, including the decryption routine, it was only a matter of time before its secrets were revealed.

In part 3 we finally manage to solve the checksum puzzle and build a new embedded controller image… but then hit a few more challenges…
