Splines in Onshape, part 1

Introduction

First can I say how much I love Onshape. I have no association with the company other than being a user, so you can trust that this is real, genuine puppy love. Onshape provides much of the functionality of parametric CAD systems such as SolidWorks in your browser, which at first sight seems like magic. Since I’m a Linux user and the native CAD landscape on Linux is quite limited, being able to do CAD in a browser works brilliantly for me. Even better, it’s free for non-commercial use.

My first little project in Onshape, several moons ago, was a vertical illuminator for my home-brew microscope. The problem I had was that when using high magnifications — which require the lens very close — I couldn’t get enough light in under the lens to illuminate opaque samples. A vertical illuminator is the solution to this problem, sending light down through the lens via a beamsplitter. I designed the physical parts in Onshape, exported to STL format and uploaded to 3DHubs; a local maker printed the parts and within a few days I had it back and working.

Scripting in Onshape

Onshape has a scripting language called FeatureScript. FeatureScript allows extending the Onshape interface with domain-specific tools (features as they’re known in Onshape). For example, if you are doing furniture design in Onshape, you can write features that produce panels and joints and then do high-level design in the user interface, rather than having to work with low-level primitives such as lines and boxes. Clearly this can be a huge boon to productivity. Here is a screenshot from one of the Onshape demonstration videos showing a custom box joint feature that, once created, can be used just like built-in features:

As a programmer and perfectionist I like the idea of writing code to simplify the CAD process rather than spending hours placing parts in a GUI. Prior to Onshape I’ve used OpenSCAD a few times. But for complex designs it gets difficult to mentally keep track of what’s where without having a live preview, and OpenSCAD rendering can get very slow once you start chamfering, filleting and making things pretty. This is where Onshape and FeatureScript shine.

Recently I’ve been doing a bit of jewellery design in Onshape, and this involves a lot of curves and curved surfaces. While I can approximately create the curves I want in the GUI, I wanted to do it programmatically in FeatureScript. Unfortunately a number of the functions that I needed to use were minimally documented so I had to do a lot of trial and error in the process. This article is an attempt to explain some of the missing links for those following in my footsteps.

Onshape is still being developed at a breakneck pace, and since I started writing this article there are now a number of new features related to curves including the option to directly create splines in 3D. I’ll start, though, from the ‘traditional’ way of creating curves in Onshape — by creating them in a 2D sketch and then lofting/extruding/etc. — and I’ll briefly mention 3D curves later in the piece.

Basics of splines

There’s a lot of mathematical jargon around splines and polynomials, so I’ll try to use both the technically correct terms and plain English where possible.

A spline is a piecewise polynomial. The curve is made up of one or more pieces, where each piece is a polynomial. The polynomials are normally chosen such that they “match up” at the transitions and you end up with something that looks like a single continuous curve. There can be various definitions of “matching up”, however to produce a visually smooth curve, at least the function values and the first derivative need to match (C1 continuity), and usually the second derivative is also chosen to match (C2 continuity).

In the case of Onshape splines used in sketches, the pieces are normally cubic polynomials — polynomials of maximum degree 3 — in two dimensions x and y. However, I should be clearer what I mean by that, as there could be at least three different things that could be meant by cubic polynomials in two dimensions:

Explicit form: y=f(x):
y = ax3 + bx2 + cx + d
(defined uniquely by 4 free constants: a,b,c,d)

Parametric form: x=f(t), y=g(t) for some parameter t:
x = at3 + bt2 + ct + d
y = et3 + ft2 + gt + h
(defined uniquely by 8 free constants: a,b,c,d,e,f,g,h)

Implicit form: f(x,y)=0:
ax3 + bx2y + cxy2 + dy3 + ex2 + fxy + gy2 + hx + ky + l = 0
(defined uniquely by 10 free constants: a,b,c,d,e,f,g,h,k,l)

The explicit form can produce, say, a parabola y=x2, but can’t produce a parabola rotated by 90 degrees. Therefore it makes little sense for software such as Onshape which must be able to produce curves in any orientation.

The implicit form is the most powerful, in that it can express curves that can’t be expressed in the other two forms, but it is difficult to evaluate the set of points that are part of the curve.

Therefore, as might be expected, Onshape curves are of the parametric type: a curve in two dimensions is produced parametrically by evaluating functions x(t) and y(t) for t from 0 to 1. The functions x(t) and y(t) are splines: there may, for example, be one polynomial piece from t=0 to t=0.5 and one from t=0.5 to t=1. Some of the FeatureScript functions related to curves, such as evEdgeTangentLine() and evEdgeCurvature(), take this parameter t as an argument.

Drawing 2D splines – single polynomial piece

Splines can be created programmatically in an Onshape sketch using the skFitSpline() function. First let’s start with a curve with only one polynomial piece between two points (0,0) and (100,100). In FeatureScript we can write:

skFitSpline(sketch, "spline1", {
        "points" : [
            vector(0,0) * meter,
            vector(100,100) * meter
        ],
        "startDerivative" : vector(150,0) * meter,
        "endDerivative" : vector(0,150) * meter
});

Here is the result:

Note that I’ve specified a startDerivative and endDerivative that specify the starting and ending direction; if I had only specified a start and end point then the result would have just been a straight line.

When I was first experimenting with this, my first question was: why do the start and end derivatives have length units (in this case, 150 meters)? With a bit of experimentation it’s clear that these these vectors should be scaled together with the curve, i.e. if we scale the curve up by a factor of two we should also double the startDerivative and endDerivative. Also, a larger magnitude makes the curve launch with more momentum in the given direction; for example, here is the curve with startDerivative increased to (500 meters,0):

But what does the magnitude of these vectors actually mean?

It all becomes clearer, however, when you consider the curves in parametric form as described in the previous section: x=f(t) and y=f(t). It turns out that the given derivative vectors are the derivatives of (x(t), y(t)) with respect to the parameter t (i.e. (dx/dt, dy/dt)) at t=0 and t=1. If this is too abstract to visualize, there is also a simple mapping to Bézier control points that I’ll explain below.

It turns out that the start point and end point, and the two derivative vectors – four (x,y) pairs in total – uniquely define the eight parameters of the parametric cubic polynomial. Thus, any parametric cubic polynomial can be specified in this way.

If you don’t specify the derivative at the start point or end point, the second derivative at that point is set to zero, which is sufficient to provide the missing constraint on the polynomial.

Drawing Bézier curves

If you are familiar with Bézier curves the above discussion will sound familiar. A Bézier curve is defined by four control points (call them P1, P2, P3, P4). The curve launches from P1 in the direction of P2, and then approaches P4 from the direction of P3. If P2 is further away from P1, then the curve launches with more momentum in the given direction.

It turns out that these formulations are equivalent with a tiny amount of maths, and if you have the Bézier control points, you can calculate the required startDerivative and endDerivative as:

(1)   \begin{align*} \left(\frac{dx}{dt},\frac{dy}{dt}\right)_{start} &= \frac{3}{\Delta t} \left(P_2-P_1\right)\\ \left(\frac{dx}{dt},\frac{dy}{dt}\right)_{end} &= \frac{3}{\Delta t} \left(P_4-P_3\right) \end{align*}

where Δt will be 1 for the simple case that t varies from 0 to 1 across the curve segment, and the 3 arises from the degree of the polynomial (because d/dt(t3) = 3t2). Thus, for a single-piece spline, the startDerivative and endDerivative will be three times the distance to the corresponding Bézier control point.

Of course you can also calculate the Bézier control points from the startDerivative and endDerivative by rearranging these equations for P1 and P2:

(2)   \begin{align*} P_2 &= P_1 + \frac{\Delta t}{3} \left(\frac{dx}{dt},\frac{dy}{dt}\right)_{start} \\ P_3 &= P_4 - \frac{\Delta t}{3} \left(\frac{dx}{dt},\frac{dy}{dt}\right)_{end} \end{align*}

Evaluating splines in Onshape

You can evaluate the spline at any point using the evEdgeTangentLine() function in FeatureScript, providing the parameter value t (from 0 to 1). The Line object that is returned has an origin that provides the evaluated point and has a direction that is tangent to the spline.

var curve = qCreatedBy(id + "sketch", EntityType.EDGE);
var tangent = evEdgeTangentLine(context, {
        "edge": curve,
        "parameter": 0.5,
        "arcLengthParameterization": false
});
println(tangent.origin);
println(tangent.direction);
Debug pane output:
(68.75 meter, 31.25 meter, 0 meter)
(0.7071067811865475, 0.7071067811865475, 0)

This can be useful for performing further geometry calculations after drawing a spline. For completeness I’ll note that there is also a evEdgeTangentLines() function — which is identical but allows evaluating the curve at multiple points in one call — and an evEdgeCurvature() function — which returns not only a tangent but also normal and binormal vectors.

Drawing 2D splines – multiple polynomial pieces

Now let’s add another point to the spline:

skFitSpline(sketch, "spline1", {
        "points" : [
            vector(0,0) * meter,
            vector(10,10) * meter,
            vector(100,100) * meter
        ],
        "startDerivative" : vector(150,0) * meter,
        "endDerivative" : vector(0,150) * meter
});

Here is the result:

Now there are two polynomial pieces, one that goes from (0,0) at t=0 to (10,10) at t=0.25, and one that goes from (10,10) at t=0.25 to (100,100) at t=1. The breakpoint between the two – called a knot – is at t=0.25.

Why is the knot at t=0.25? Well, this knot could actually be placed anywhere in ‘t space’, for instance t=0.1 or t=0.5, but as the second piece of the curve is much longer there is an argument for assigning more ‘t space’ to the second piece. Onshape uses the square root of the chord length between points as the metric; the lengths of the two chords here are 14.1m and 127.3m so the knot location is chosen as sqrt(14.1m)/(sqrt(14.1m)+sqrt(127.3m)) = 0.25.

The polynomial pieces are chosen such that both the first and second derivative are continuous through the middle point, which uniquely defines the two polynomials. The resulting curve ends up being the same as what would be produced by the following two skFitSpline calls:

skFitSpline(sketch, "spline1.piece1", {
        "points" : [
            vector(0,0) * meter,
            vector(10,10) * meter
        ],
        "startDerivative" : vector(37.5,0) * meter,
        "endDerivative" : vector(8.4375,17.8125) * meter
});
skFitSpline(sketch, "spline1.piece2", {
        "points" : [
            vector(10,10) * meter,
            vector(100,100) * meter
        ],
        "startDerivative" : vector(25.312,53.438) * meter,
        "endDerivative" : vector(0,112.5) * meter
});

… where I’ve calculated the required piece1 endDerivative and piece2 startDerivative so that both first and second derivatives match at the joining point (10,10). Note that all of the derivatives end up scaled when splitting the curve, as they are derivatives with respect to t and t is no longer the same variable (now each piece has t from 0 to 1).

As a corollary, if you only care about first derivative continuity and not second derivative continuity, then you can actually get a larger variety of curves by drawing the two parts individually: then you can choose any value for the piece1 endDerivative and piece2 startDerivative as long as the direction matches.

B-splines

Actually Onshape represents all of these splines in terms of B-splines rather than as an array of polynomial co-efficients a,b,c,d,e,f,g,h for each piece. Contrary to popular belief, there is nothing voodoo about B-splines, they are just a tool for expressing splines in a more convenient form.

The basic idea is that we can separate a spline function into a set of control points c, which define an approximate path for the spline, and a set of functions b(t), which give the weighting of each control point at a parameter value t along the curve. The spline may not pass through all or even any of the control points c, but moving the control points allows you to control the path of the curve.

For example, the control points for the spline that we generated in the previous section are are depicted below (I’ll explain how these are determined from the skFitSpline() parameters in a moment):

If we think of c as a column vector of control points and b(t) as a row vector, then at any point we can evaluate the spline function as:

(3)   \begin{equation*} \mathbf{s}(\mathrm{t}) = \mathbf{b}(\mathrm{t}).\mathbf{c} \end{equation*}

b(t), which weights the control points, is chosen such that it has a number of nice properties. The sum of components in b(t) should always be 1 for any t, which is required for scale-invariance (i.e. if you scale up the control polygon c, the spline should scale with it). Also, b(t) should be smooth everywhere (i.e. continuous first and/or second derivatives), such that s(t) is smooth everywhere.

A suitable function b(t) can be generated using what’s called the Cox-de Boor recursion formula. The only input to this algorithm is an ordered list of t values called knots – these are transition points between a set of underlying polynomials used to construct b(t).

To make it easy to experiment with B-splines, I wrote a simple Octave/MATLAB function that generates b(t) for a given polynomial degree and a given set of knots. The code is here: bsplinebasis.m. For example, for the previous curve with a knot at 0.25, we can evaluate:

octave:1> degree=3;
octave:2> knots=[0,0,0,0,0.25,1,1,1,1];
octave:3> bsplinebasis(degree, knots, 0.5)
ans =

   0.00000   0.16667   0.44444   0.35185   0.03704

In other words, at t=0.5, the five control points of the spline are weighted according to this vector. Notice that here (and in fact anywhere beyond the first knot at t=0.25) the first control point has no effect; this is a B-spline property called local support.

I’ve failed to explain something. It makes sense that the knot locations are at t=0,0.25,1, which corresponds to the pieces of the spline, but notice that I’ve repeated the t=0 and t=1 knots four times on line 2. This is called knot multiplicity. In order for the spline to be “clamped” to the first and last control points, these knots must be repeated at least n+1 times (where n is the degree of the polynomial). A picture that illustrates this can be found on this website.

We can now draw the spline in Octave and confirm that it looks the same as in Onshape, and indeed it does:

octave:4> t=[0:0.01:1];
octave:5> cpts=[0+0i; 12.5+0i; -8.75+16.25i; 100+62.5i; 100+100i];
octave:6> plot(bsplinebasis(degree,knots,t)*cpts,'-',cpts,'--.')

In the above code I’ve expressed the control points as complex numbers x+iy as a convenience. I could also have done, equivalently:

octave:7> xcpts=[0; 12.5; -8.75; 100; 100];
octave:8> ycpts=[0; 0; 16.25; 62.5; 100];
octave:9> plot(bsplinebasis(degree,knots,t)*cxpts,
                   bsplinebasis(degree,knots,t)*cypts)

Note that the control points are not the same as the points you specify to skFitSpline(). In this example, the specified points to skFitSpline() were (0,0), (10,10), (100,100) with startDerivative of (150,0) and endDerivative of (0,150), while the corresponding spline control points are (0,0), (12.5,0), (-8.75,16.25), (100,62.5), (100,100). How did I arrive at these control points? The first and last fit points become the first and last control points. The second and second-last control points can calculated from the startDerivative and endDerivative using equation set 1. The other control points are derived from the remaining fit points with linear algebra, using equation 3.

You can create a a spline with a given set of control points and knots directly using skSplineSegment() as follows:

// Not recommended, not a public API
skSplineSegment(sketch, "spline2", {});
var guess = {"spline2": [
               0, // isClosed
               0, // isRational
               3, // degree
               5, // numControlPoints
               0,0,12.5,0,-8.75,16.25,100,62.5,100,100, // pts
               0,0,0,0,0.25,1,1,1,1, // knots
               0, // lowParameter
               1  // highParameter
               ]};
  skSetInitialGuess(sketch, guess);
  skSolve(sketch);

Note however that however this is not a public API and subject to change, so it’s better to use skFitSpline() where possible. If you want to define a spline using control points, you can easily convert it to skFitSpline() form by evaluating it at all of its knots, and calculating the startDerivative and endDerivative using equation set 2.

(For completeness, there are actually at least four internal functions that produce splines: skSpline(), skSplineSegment(), skInterpolatedSpline() and skInterpolatedSplineSegment(). The ‘Interpolated’ versions take a set of points the curve passes through plus the start and end derivatives, while the non-interpolated versions take control points. The ‘Segment’ functions produce an open spline with endpoints, whereas the non-segment functions can be used for closed splines without endpoints. skFitSpline() is a more user-friendly version of skInterpolatedSplineSegment().)

Why B-splines?

It can be seen that the B-spline form of a spline has a number of advantages over specifying polynomial co-efficients for each polynomial piece. Firstly the control points have a more intuitive geometric interpretation than the polynomial co-efficients. Secondly, if b(t) is chosen to have second-derivative continuity, then s(t) automatically has second-derivative continuity without needing to fit a set of co-efficients that satisfy this criterion. However there is nothing magic about B-splines: you can calculate the polynomial co-efficients from the B-spline control points, and vice versa, so they are just different representations of the same curves.

Note also that for a single-piece spline with a simple knot vector like [0,0,0,0,1,1,1,1], b(t) is exactly equivalent to the weighting that defines a Bézier curve, and the resulting spline is the same as the Bézier curve with the same control points. More complicated splines can be decomposed into multiple such Bézier pieces. So again there is nothing more or less powerful about these splines versus a set of Bézier curves joined together; however the B-spline representation requires less control points for the default case that second-derivative continuity is desired.

In fact I’ll admit to a little sleight-of-hand when preparing this article, where I’ve used the properties of B-splines to make my life easier. In one of the sections above, I split the curve created by skFitSpline() into its two equivalent pieces. The first time I did this, I did it from first principles — writing down the curves in polynomial form and setting the second derivatives equal at the joining point — and it took me a few pages of working. When I needed to redo it for the example spline in this article, though, I used an easier method which is to split the B-spline using a knot insertion algorithm. This is a method for inserting extra knots (and calculating updated control points) while preserving the same curve. By inserting two extra knots at t=0.25, the control polygon can be made to pass through the point (10,10) and we now have two explicit Bézier pieces. Here is the process I used in Octave:

octave:10> cpts,knots
cpts =

     0.00000 +   0.00000i
    12.50000 +   0.00000i
    -8.75000 +  16.25000i
   100.00000 +  62.50000i
   100.00000 + 100.00000i

knots =

   0.00000   0.00000   0.00000   0.00000   0.25000   1.00000
   1.00000   1.00000   1.00000

octave:11> [cpts,knots]=bsplineinsert(degree,cpts,knots,0.25)
cpts =

     0.00000 +   0.00000i
    12.50000 +   0.00000i
     7.18750 +   4.06250i
    18.43750 +  27.81250i
   100.00000 +  62.50000i
   100.00000 + 100.00000i

knots =

   0.00000   0.00000   0.00000   0.00000   0.25000   0.25000
   1.00000   1.00000   1.00000   1.00000

octave:12> [cpts,knots]=bsplineinsert(degree,cpts,knots,0.25)
cpts =

     0.00000 +   0.00000i
    12.50000 +   0.00000i
     7.18750 +   4.06250i
    10.00000 +  10.00000i
    18.43750 +  27.81250i
   100.00000 +  62.50000i
   100.00000 + 100.00000i

knots =

   0.00000   0.00000   0.00000   0.00000   0.25000   0.25000
   0.25000   1.00000   1.00000   1.00000   1.00000

octave:13> plot(bsplineeval(degree,cpts,knots,t),'-',cpts,'--.')

Now control points 1 to 4 and control points 4 to 7 define two Bézier curves, and we can compute the required startDerivative and endDerivative values:

octave:13> startDerivative1 = (cpts(2)-cpts(1))*3
startDerivative1 =  37.500
octave:14> endDerivative1 = (cpts(4)-cpts(3))*3
endDerivative1 =   8.4375 + 17.8125i
octave:15> startDerivative2 = (cpts(5)-cpts(4))*3
startDerivative2 =  25.312 + 53.438i
octave:16> endDerivative2 = (cpts(7)-cpts(6))*3
endDerivative2 =    0.00000 + 112.50000i

I’ve uploaded the code for bsplineinsert.m to my GitHub along with some other B-spline utility functions.

Taking a break

Hopefully in this instalment you’ve learnt a few things about splines and how they can be represented in Onshape… I know I have. In Part 2, coming soon, I’ll discuss offset splines, derivatives of splines, creating splines directly in 3D, and some other tidbits about Onshape internals.

Posted in Computing | 2 Comments

Dell Venue 11 Pro travel keyboard troubleshooting

Introduction

I recently purchased a pair of Dell Venue 11 Pro 7140 tablet computers — one for myself and one for my girlfriend. I figured this would be a good crossover device between a tablet and a laptop, and so far I’m not disappointed. One important reason I chose this model is because they are more easily serviceable than other comparable offerings like the Microsoft Surface Pro line; the Surface Pro is glued together in a way that is difficult to disassemble and reassemble, which makes me very unhappy. Batteries have a limited service life (as I know all too well from my last post), and I don’t want to throw away a perfectly good tablet in a couple of years just because the battery is not holding a charge. Even if you don’t have any interest in taking apart your own devices, I would encourage you to consider the fixability of your device and vote with your wallet against throw-away devices. Anyway, enough of that rant.

Together with my Dell tablet I bought a keyboard. There’s two types of keyboard available for the Dell Venue 11 Pro: a “slim tablet” keyboard (K11A), which is just a keyboard and touchpad, and a “mobile” or “travel” keyboard (K12A), which additionally has a battery in it. I chose the latter version with the battery inside so that I could get more battery life out of my tablet.

I ran into an interesting problem with the travel keyboard, or at least my unit. Sometimes after attaching it to the tablet, the touchpad didn’t work. Sometimes the whole keyboard wouldn’t work. Strangely though, if I disconnected the battery inside the keyboard or let it drain completely, it would always work reliably, so I knew that it was some interaction with the voltage from the battery.

Most people at this point would call Dell, and unless you’re an electronics geek this is probably the best option. But I thought I would try to track it down.

Initial probing

The first thing I did after removing the cover was to probe the 6-pin touchpad connector, since that was the most obvious point of brokenness. The pinout for this connector looks approximately like this:

Pin Description
1 I/O (PS/2 clock?)
2 ground
3 I/O (PS/2 data?)
4 connected to 3.3V via 0 ohm jumper
5 I/O (touchpad button?)
6 3.3V

In normal working operation, five of the six pins — all except ground — probed as 3.3V, which is the normal idle state of the PS/2 bus. In the broken state, pin 3 was stuck low (0V). Essentially the touchpad PS/2 bus was in a locked up state. Note that the button may still work in this state (as I recall someone else on the Internet noting).

Most interesting was what happened when the keyboard was disconnected from the tablet, leaving it powered only by its internal battery. Now all the 3.3V pins dropped to 2V. This was likely the cause of the PS/2 bus lockups: the components running off that voltage supply (touchpad, keyboard, and associated microcontroller) were unlikely to work reliably at 2V. When the tablet was plugged back in, the voltage rose back to 3.3V, but the microcontroller and/or touchpad were sometimes in a bad state from their sojourn at 2V.

I initially thought that that the 2V might be due to a “back powering” issue (“back powering” occurs when an unpowered IC has voltage applied to some I/O pins, and current flows from those I/O pins into the power rail). However, I couldn’t find any ICs that were connected in a way that would cause back-powering. Rather it seemed that there was some issue with the circuit that turns the power on and off.

Power control circuit

The circuit that controls power to the keyboard/touchpad looks like this:

This type of circuit is what’s known as a “high-side switch”. The element doing the actual switching is Q2, a P-channel MOSFET transistor, and Q1 is an N-channel MOSFET acting (together with R2) as an inverter.

When the keyboard is attached to the tablet, ENABLE (USB_5V) is 5V → transistor Q1 is on → node X is pulled to ground → Q2 turns on fully → OUT is 3.3V. This was working as expected in my keyboard.

When the keyboard is disconnected, USB_5V is floating and pulled down to ground by R1 → Q1 is off → node X should be pulled up to 7V by R2 → Q2 should turn off fully → OUT should float to 0V assuming nothing else is powering it. This was not working correctly in my unit. Instead, node X was at 2.7V and OUT at 2.0V.

(N.B. the “7V” I refer to may not be exactly 7V; it’s the battery voltage minus a bit so may be higher or lower depending on how much charge is in the battery.)

Tracking down the problem

My first thought was that the 100K pull-up resistor might be too high, and the leakage current through the three transistors was dragging down node X from 7V to 2.7V (it only takes 40 μA to drop 4V across 100K, and in my world 40 μA is tiny).

I removed R1 from the board and replaced it with a 65K resistor instead, thinking that would solve the problem. To my great surprise node X remained around 2.7V.

I then tried temporarily connecting node X to 3.3V with a 10K resistor to make the pull-up even stronger, and it still barely moved above 2.7V.

The value 2.7V is about a diode drop below 3.3V which I thought might be a clue. To determine whether this was relevant, I tried an experiment with the 3.3V rail off, connecting node X to 2V from a current-limited power supply. Now both the 3.3V rail and the OUT rail were hovering at 1.2V, and node X was sinking about 1.2mA. Where was this current going? Q1 should have been off, and Q2 and Q3 are oriented with their gates connected to node X so should not be sinking any current from it (the gate of a MOSFET is insulated from the other terminals so the gate current should be essentially zero).

I considered three options:

  1. Q2 was not a P-channel MOSFET but something else. (I couldn’t determine the part number for any of Q1/Q2/Q3; part markings on these tiny SMD parts often have no correlation with the part numbers, so working backwards from markings is difficult unless you happen to stumble on the right datasheet.)
  2. There was something else connected to node X that I hadn’t found in my tracing of the board.
  3. One of the transistors was broken and leaking current.

Testing components in-situ is tricky as multimeter readings are affected by anything else connected to the nodes, for example node X has a capacitor that will sink current until charged up. After a day of head-scratching and verifying my schematic, I decided that (c) was most likely and I pulled Q2 off the circuit board with a dose of hot air.

It was immediately apparent that Q2 was broken as its gate-to-source and gate-to-drain resistances were only a few kiloohms. In a MOSFET, the gate is behind an insulator, so these resistances should be megaohms. The low resistance from gate-to-source and gate-to-drain suggests that the gate oxide in this MOSFET is damaged. (Interestingly, the resistance is different in the gate-to-source direction vs the source-to-gate direction, and similarly for gate-to-drain vs drain-to-gate. This makes sense if you consider that a MOSFET, under the oxide, contains two p-n junctions like a BJT transistor.)

After replacing Q2 with a new P-channel MOSFET, the output voltage now turns off fully, and my keyboard and touchpad work normally.

Final thoughts

One thing I learnt from this debugging process is that MOSFETs can fail in interesting ways: not just ‘on’ or ‘off’, but with current leaks that behave partly like resistors and partly like p-n junctions, reflecting the internal construction of the device.

As to why this device failed, I’m still not sure. My best guess from the marking (N1D3C) is that it’s an Si2301CDS which, if correct, seems perfectly up to the task. Most MOSFET power switches fail due to thermal stresses, but the current delivered by this circuit is very low compared to what this transistor can handle. I also doubt that there are any transient voltage spikes as both input and output have capacitors. My only thought is that the pull-up to around 7V (VGS(off) around +4V) may contribute to stress on the gate oxide over time, compared to if it was driven more conservatively with 3.3V (VGS(off) = 0V). The purpose of driving the gate to +4V when off is likely to reduce off-state power consumption, by “over pinching” the channel to reduce leakage. However it may also increase the likelihood of charge carriers migrating into the gate oxide. The datasheet specifies VGS = +8V as the “absolute maximum” beyond which permanent damage may occur, however there will also be a threshold beyond which device reliability is affected, which is unfortunately not stated in the datasheet. It may be that keeping the device for extended periods at VGS around +4V — as will be the case whenever the keyboard is off — is not a good idea. It’s clear that the designer wasn’t sure what the right answer was, either, since they’ve included pads for pulling up to both 3.3V and to 7V. Presumably a lot of people have this keyboard and it works just fine, so there is some element of silicon defects or bad luck as well.

Anecdotally it seems that a few other people on the Internet have had similar issues to mine, although I’m not sure whether they all stem from the same root cause. If you have this exact same behaviour (your Venue 11 Pro touchpad doesn’t work when the battery is charged but works reliably when battery is flat), and if you’re an electronics geek and you want to open your keyboard, I’d love to know whether your voltages at X and OUT have a similar anomaly to mine. Of course, if you can get it fixed under warranty then you should probably do that before risking voiding your warranty 🙂

Posted in Computing | Leave a comment

Unlocking my Lenovo laptop, part 3

The decryption function

If you are just joining this story you may want to start at part 1.

In part 2, we discovered that a embedded controller update is performed by uploading a small ‘flasher’ program to the EC. This flasher program is then responsible for programming a new firmware image to the EC’s internal flash memory. However, both the flasher program and part of the firmware image are encrypted: the old (currently running) EC firmware decrypts the flasher program, and the flasher program then decrypts the new firmware update. This creates a bit of a chicken-and-egg problem that prevents discovering the encryption algorithm from firmware update files alone.

We managed, however, to find a decrypted version of the EC firmware online, dumped directly from EC flash memory via JTAG. Let’s dive right in and disassemble the decryption function that can be found in that flash image. The core of it looks like this:

2854:  30 25 8d 1f 00 00 48 14      ld         r13,[r13,0x1448]
285c:  30 21 81 0f 00 00 48 10      ld         r1,[r1,0x1048]
2864:  cf 7e                        extb_s     r14,r14
2866:  02 be                        asl_s      r14,r14,2
2868:  b9 61                        add_s      r1,r1,r13
286a:  30 26 8d 1f 00 00 48 18      ld         r13,[r14,0x1848]
2872:  4f 7f                        extb_s     r15,r2
2874:  02 bf                        asl_s      r15,r15,2
2876:  a7 79                        xor_s      r1,r1,r13
2878:  30 27 8d 1f 00 00 48 1c      ld         r13,[r15,0x1c48]
2880:  b9 61                        add_s      r1,r1,r13

Here each input byte is transformed through a lookup table (in cryptography terminology, a substitution box or ‘S-box’) and the results are combined with an add/xor/add structure. This is the Blowfish cipher, as becomes evident from one glance at the diagram in the Wikipedia article on Blowfish:

Blowfish F function

[Diagram by Decrypt3/DnetSvg via Wikimedia, CC BY-SA 3.0]

Now normally the first stage of Blowfish, like most ciphers, would be to expand a user-provided key – a password or passphrase or some other secret data – to produce a set of 18 round keys (called the ‘P array’ in Blowfish terminology) and the four 256-entry S boxes depicted above. In cryptography this expansion step is called a key schedule.

In the case of the Lenovo firmware, it turns out that the keys are stored in pre-expanded form, i.e. the values of the P array and S boxes are stored in flash memory rather than the original key string. We can extract the P array and S boxes from the dumped flash image and use them for encryption/decryption.

(I do not believe there is any easy way – where in cryptography easy means ‘significantly better than trying all possibilities’ – to recover the original key string that was used to generate the P array and S boxes. This would be an interesting challenge, but is of purely academic interest: the P array and S boxes are all that is needed for both encryption and decryption. Each round of Blowfish involves taking the data, mixing in [exclusive or] one of the round keys from P and then performing the function depicted above using the S boxes; this is repeated 16 times; apart from some minor details you now understand how Blowfish works.)

The checksum function

Having the encryption algorithm and keys, we can now decrypt the flasher program that is uploaded to the embedded controller when performing a firmware update.

Analysis of the flasher program allows us to determine the checksum algorithm used to validate the firmware image. Even without a detailed analysis of the disassembly, we can notice that it uses the following table:

uint16_t csum_table[256] = {
 0x0000, 0x1021, 0x2042, 0x3063, 0x4084, 0x50A5, 0x60C6, 0x70E7,
 0x8108, 0x9129, 0xA14A, 0xB16B, 0xC18C, 0xD1AD, 0xE1CE, 0xF1EF,
 0x1231, 0x0210, 0x3273, 0x2252, 0x52B5, 0x4294, 0x72F7, 0x62D6,
 0x9339, 0x8318, 0xB37B, 0xA35A, 0xD3BD, 0xC39C, 0xF3FF, 0xE3DE,
 ...
};

A quick Google for these numbers gives this secret away easily: this is a lookup table for a CRC-16 function with a generator polynomial of 0x1021. At this point we can be pretty sure that the algorithm is CRC-16, with only a few details to determine like initialisation value, but let me indulge myself and make a brief digression into what a CRC is and why the table looks like this. You can skip to the next section when this gets too dense for your liking.

The first thing to note is that when people refer to a CRC-16 generator polynomial of 0x1021, what they actually mean is 0x11021, i.e. binary 10001000000100001, i.e. x^16 + x^12 + x^5 + 1. (The x^n term of a CRC-n generator polynomial is always assumed to be 1, otherwise the following algorithm would not produce an n-bit CRC.)

Now the best way to imagine calculating a CRC is, for every 1 bit that is set in the input message, you XOR in the polynomial (towards the right, if working left to right). This clears that bit and flips some subsequent bits: for the above polynomial, it flips the three bits at N+4,N+11,N+16. If those bits end up as 1 (when you come to them), they will in turn propagate to other bits. Once you reach the end of the message, you’ve cleared all the bits in the message, but you have some bits hanging off the end: these are the CRC. You can intuitively see from this that the CRC has error propagating properties; an erroneous bit will likely cascade to many other bits. (Engineers might be reminded of a linear feedback shift register which has many similarities. The CRC can also be expressed mathematically as the remainder of a long division in carry-less binary arithmetic [GF(2)], the similarity to long division will become apparent in the examples below.)

As an example, if the input message to our CRC is one byte with value 5 (00000101), then the CRC calculation would proceed like this, where I have highlighted in red the leading 1 at each step:

        (5)  00000101|
              xor 100|01000000100001 (0x11021)
             -----------------------
             00000001|01000000100001
                xor 1|0001000000100001 (0x11021)
             -------------------------
             00000000|0101000010100101 (0x50a5)

Here is another example for an input byte with value 16 (00010000):

       (16)  00010000|
            xor 10001|000000100001 (0x11021)
             ---------------------
             00000001|000000100001
                xor 1|0001000000100001 (0x11021)
             -------------------------
             00000000|0001001000110001 (0x1231)

For larger messages doing this bit by bit would be slow, of course. To do it more efficiently we can pre-calculate the effect of a byte on the next sixteen bits (this will be some pattern of bit flips), for each of the 256 possible values of that byte. We then look up the first byte, XOR that in to the next sixteen bits, move to the next byte and repeat. Actually the “effect of the first byte on the next sixteen bits” is exactly equivalent to the CRC of that byte – the CRC is the pattern of bit flips that ends up past the end of a message – so the lookup table is in effect a table of CRCs of single bytes.

Looking at the above examples you can verify that the CRC of 5 is indeed the entry with index 5 in the table above, and the CRC of 16 is indeed the entry with index 16. If you stare at the first example for a moment, you might see why the first sixteen entries of the table are multiples of 0x1021: all the 1 bits in 0x11021 are far enough apart that the input bits propagate independently into the CRC. Things change at entry 16 because the next 1 in the polynomial crosses over into the message.

Now, back to your regularly scheduled programming

Returning to the firmware, we can verify that the flasher checksum is in fact CRC-16. It’s important to note that the flasher program calculates this checksum after decrypting the firmware image, so the CRC-16 must be calculated on the decrypted version of the image. (It turns out that the firmware image is encrypted using the same key as the flasher program, so at this point we already have everything we need to know.)

I mentioned in part 2 that a simple 16-bit checksum is also present at the very end of the firmware image. In fact, there is also a third type of checksum that we discover when we disassemble the EC boot code: four 32-bit checksums, each performed on a different area of the flash memory.

Summarising now, the three EC checksums that must be correct are:

  • The outer checksum: BIOS checks the checksum of the (encrypted) image before uploading it to the flasher. The last two bytes of the image are calculated such that the total sum of the 16-bit words of the image is zero. If this fails, flashing doesn’t proceed, so this is the least dangerous checksum to fail.
  • The flasher checksum: The flasher calculates the CRC-16 checksum of all the decrypted bytes except the last four bytes. The penultimate two bytes of the image contain this checksum. If this checksum fails, the flasher sets a flag in flash that causes the EC firmware to fail to boot.
  • The boot checksums: The EC firmware, at boot, calculates 32-bit checksums of four different areas of the flash. If any of these fail the EC firmware fails to boot. These checksums must, of course, also be calculated on the decrypted image.

Building a modified firmware

We now have everything we need to know to successfully modify the embedded controller firmware. Returning to my original goal, here is the change I made to the battery authentication check (before/after, with some comments added):

  ; call validation function on battery response
  1d168:   ee 0f af f2         bl.d       0x2954 ; validate
  1d16c:   f8 16 02 11         ldw        r2,[r14,248]
- ; branch to failure path if return value not equal to 1
- 1d170:   0b 08 51 00         brne       r0,1,0x1d17a
+ ; branch replaced with no-operation instruction
+ 1d170:   4a 26 00 70         nop        

  ; success path - set bit 3 in battery status
  1d174:   00 86               ld_s       r0,[r14,0]
  1d176:   83 b8               bset_s     r0,r0,3
  1d178:   24 f0               b_s        0x1d1c0

  ; failure path - try a different validation function,
  ;                else retry up to 8 times, else abort
  1d17a:   10 10 80 20         ldb        r0,[r16,16]
  ...

Previously, if the return value from the validation function was not equal to 1, the code would branch to a failure path. I replaced this conditional branch instruction with a no-operation instruction so that, regardless of the validation result, execution would fall through to the success path. (This technique of replacing jumps with nops is a common trick when you need to modify the behaviour of binary code.)

(At this point my focus was on the first authentication sequence – state 12 in the firmware state machine that I listed in part 2.)

Also, for fun and so I could track my modified firmware version, I changed the embedded firmware version from “GCHT25WW” to “GCHT25MC”. I thought putting my initials in the geographic region field would be appropriate: the original suffix, WW, presumably means worldwide, while this firmware had… somewhat more limited geographic reach.

Having made my changes, I wrote some small utilities to regenerate the three types of checksums – I have published these utilities on GitHub – and finally I re-inserted the modified embedded controller firmware into the BIOS update .FL2 file at offset 0x500000.

If you are attempting this, make sure you check and double-check that you’ve modified exactly what you intended to modify. For Linux users, process substitution can be handy here, e.g. to get a binary diff of two files you can do: diff -u <(xxd file1.bin) <(xxd file2.bin).

The moment of truth?

I ran the BIOS update program and was greeted with:

An update is not necessary at this time. The process has been canceled.

Thwarted. I was already running the latest BIOS version and the update program did not offer an option to force an update of the BIOS or EC. Downgrading and upgrading might usually be a workaround, but I wasn’t sure that this would be possible as the Lenovo release notes mention that, “if the UEFI BIOS has been updated to version 2.63 or higher, it is no longer able to roll back to the version before 2.63 for security improvement”.

Fortunately, it turns out it’s possible to run the update program manually, bypassing the limited user interface.

Manually updating the EC

A note: As I do not have Windows on this laptop, I am running the BIOS update program from a USB flash drive following some instructions I found online (in short: download the bootable CD image; run geteltorito -o bios.img gcuj23us.iso; write the bios.img file to the USB stick). I suspect the below process is even easier from Windows, where you can directly run winflash32.exe or winflash64.exe in place of dosflash.exe.

The Lenovo BIOS update CD is just a DOS boot disk at heart. If you’ve written it to rewritable media like a USB flash drive, you can edit autoexec.bat to stop it starting the user interface, in which case it will drop you to a DOS prompt.

Updating the EC firmware can be performed using dosflash.exe as follows (/sd is skip date check to allow downgrading, /ipf ec targets the EC area):

C:\FLASH> dosflash /sd /ipf ec /file GCETA3WW\$01DA000.FL2
SCT Flash Utility for Lenovo
 for Shell V1.0.1.3
Copyright (c) 2011-2012 Phoenix Technologies Ltd.
Copyright (C) 2011-2012 Lenovo Group Limited.

Read BIOS image from file.
Initialize Flash module.
Read current BIOS.
Oem check

Prepare to flash “ec”

Do not turn off computer during the update!!!

Begin Flashing……
Total blocks of the image = 48.
|—+—-+—-+—-+—-+—-+—-+—-+—-+—-|
…………………………………………..
Image flashing done.

Flashing finished.

BIOS is updated successfully.

WARNING: System will shutdown or reboot in 5 seconds!

A note here on the FL1 and FL2 files since I couldn’t find any explanation about this on the Internet: the FL1 file is a UEFI capsule file that contains the BIOS. The FL2 file contains embedded controller firmware at offset 0x500000-0x530000, the rest of it can be ignored. Why then is the FL2 file so large and why does it contain bits of an old BIOS version pasted after the EC firmware? I think partly it may be to appease the non-Lenovo-specific parts of dosflash.exe. I noticed that even though ultimately it only uses the 48 4KB blocks from 0x500000-0x530000, if I pass a file that ends there, dosflash.exe does not recognise it as a valid BIOS update file.

(While the command shown above updates the EC, I will note here that it is also possible to update the BIOS in a similar way, by omitting /ipf ec and by specifying the FL1 file instead of the FL2 file: dosflash /sd /file GCETA3WW\$01DA000.FL1
Of course I recommend using the the normal manufacturer-recommanded BIOS upgrade/downgrade process when possible, but this may be useful if you are in a bind.)

Note that, despite what the output of dosflash.exe says, the actual EC firmware is not updated yet: at this point it has just written the update to an area where it can be picked up by BIOS at boot. Now after reboot the screen displays:

Flashing Embedded Controller…
Please do not power off!

(I appreciate the ‘please’, but really that should read “DO NOT UNDER ANY CIRCUMSTANCES POWER OFF, EVEN IF YOUR HOUSE IS ON FIRE AND SOMEONE HAS STOLEN YOUR PANTS”.)

A few skipped heartbeats later, the firmware update completes.

The moment of truth?

ec

Sure enough, the system is now running my modified embedded controller firmware.

But my battery still isn’t charging. I hook up the SMBus signals to my logic analyser again and the communication looks like this:

...
START 16 (Control Byte: Slave Address B Write) 3C 04 NACK STOP 
START 16 (Control Byte: Slave Address B Write) 3C 04 NACK STOP 
START 16 (Control Byte: Slave Address B Write) 3C 04 NACK STOP 
START 16 (Control Byte: Slave Address B Write) 3C 04 NACK STOP 

It turns out that it isn’t even getting to the authentication check that I had modified, because the earlier command that sends the challenge to the battery is failing: as soon as the laptop sends command 3C and the data length indication 04, the battery is signalling NACK – not acknowledged – go away. So now I modify the state machine so that it proceeds whether or not that write command succeeds (again I do this by replacing a jump by a nop, this time in state 8).

Revision two deployed. Now the system boots with:

The battery installed is not supported by this system and will not charge. Please replace the battery with the correct Lenovo battery for this system. Press the ESC key to continue.

Well, that’s some sort of progress: at least it is no longer displaying the original unauthorised battery message.

I look in BIOS to see where these messages are coming from. Both this message and the original unauthorised battery message are displayed by LenovoVideoInitDxe.efi: don’t ask me why this code is in this module rather than somewhere more relevant (may I suggest LenovoAnnoyingBatteryMessageDxe.efi?), but it might have been convenient to put it in the video initialisation module as the message is displayed when the screen is cleared post-POST.

Anyway, in LenovoVideoInitDxe.efi it reads the battery status from the EC (register 0x38, which we came across in part 2 when decoding the ACPI tables, as well as register 0xd1 which has some additional battery status flags). Depending on certain bits in those registers, it may print one or other message.

Tracing through the EC code to find where those bits are set proves difficult, so I again hook the battery up to my logic analyser. It becomes evident that the sticking point is now the second authentication sequence.

I now make the same changes to the second authentication sequence as I did for the first: I prevent the write command from failing even if the battery NACKs it (state 15), and force the check of the response to succeed (state 19). This is now four changes in total, for each of which I’ve replaced a jump with a nop.

After booting to this final EC firmware revision, my saga comes to an end, almost anticlimactically. My replacement battery works, and I’m getting a good few hours out of it (and no, it hasn’t burst into flames).

Postscript

There is still one very curious open question that I haven’t managed to figure out. There are 10 random-looking bytes at offset 0x200 of the firmware image – the start of what looks like a firmware information block – which are different in every EC revision. So far I haven’t found anything that accesses those bytes, and indeed my EC update works fine even when I leave them unchanged. Probably this is a red herring, but what these bytes are for is still a mystery.

I have uploaded my utilities to GitHub so that it is possible to replicate my work and possibly make other changes to EC firmware. (Edit, Jan 2017: Hamish Coleman has built some additional utilities around my code that may also be useful, see hamishcoleman/thinkpad-ec. If you are only interested in the battery patch you will need to disable his keyboard patches.) The exact changes I made to my X230T firmware are also available as a bspatch file here. However a large disclaimer applies to all of this: do not attempt EC firmware modification at home unless you understand what you are doing. If something goes wrong with an EC update, there is a high likelihood of bricking your laptop, the only recourse being connecting to the EC via JTAG. I will not be held responsible for this. You should also understand that poor quality lithium ion cells can cause fires, as has been seen in the recent spate of hoverboard fires. I will also not be held responsible for this.

I have since also torn down my old Lenovo battery, and I plan to write another post soon with some information and photos. The option of replacing the cells in a genuine battery may be worth considering as an alternative to modifying the EC firmware, the advantage being is that you can choose your own high quality Li-Ion cells versus whatever you might happen to get in a replacement battery. The disadvantage is that it inevitably results in some damage to the battery casing, and as I mentioned before the controller will remember data about the old cells which might affect function of the new cells (I will see what I can figure out on that front).

To be fair, buying a genuine Lenovo battery is probably the best option for most people, at least while Lenovo is still making replacement batteries for this model. Primarily this was an exercise in ‘because I can’: I and a great many readers have enjoyed this process of diving deep into the system architecture of a modern laptop, at a level that few people are normally exposed to, and removing an annoying limitation that I feel I should be entitled to remove from my laptop. I do not have anything against Lenovo specifically – they are certainly not the only vendor who implements vendor restrictions – so please be nice in your comments.

Posted in Computing | 64 Comments

Unlocking my Lenovo laptop, part 2

The embedded controller

In part 1, we looked at the communication between a Lenovo Thinkpad X230T laptop and battery, and discovered that there a challenge-response protocol used to authenticate ‘genuine’ Lenovo batteries. On the laptop side, this – and battery communication in general – is implemented in a component known as the embedded controller (EC). Continue reading

Posted in Computing | 23 Comments

Unlocking my Lenovo laptop, part 1

Introduction

Two months ago, I bought a new battery for my Lenovo laptop (a ThinkPad X230T). I was about to go away on holidays and wanted a battery that could last me through a plane flight; the original battery was by then barely lasting ten minutes. Little did I know that I was about to be in for an adventure. Continue reading

Posted in Computing | 19 Comments

Introduction to photography slides

These are some slide decks I used to use when I ran introductory courses for the UNSW Photography Club. They are a pretty good set of slides so I figured they should have a home on the Web.

Part I: Camera Principles (focal length, ISO speed, shutter speed, aperture, etc.)
Part II: Metering/White Balance

Bonus slide deck: Lighting and Studio Photography

Posted in Other | Leave a comment

Global food security

I went to a great lecture today by Professor Chris Barrett on “The Global Food Security Challenge in the Coming Decades”. The slides from this lecture are available here. Here are my notes:

  • Current global food demand growth is ~1.25% pa, while annual growth in supply has been falling and is now only ~1% pa.
  • This means food prices are now rising (after decades of falling food prices). 2011 was a record profit year for US farmers. This is good news for renewed investment in the agricultural sector, but until supply can be increased, the poorest will suffer.
  • Developing countries have by far the largest effect on food demand. Not only are they growing much faster than developed countries, but a much larger proportion of income increases are spent on food.
  • Currently 85-90% of food is consumed in the country it is produced. However, most arable land in Asia is already used, so rising Asian demand will require large increases in productivity per hectare or large-scale food imports.
  • The remaining unutilised arable land in the world is mostly in Sub-Saharan Africa and Latin America. Huge land grabs by foreign investors are occurring as a result. In many corrupt countries, the proceeds are going to the political classes, while the poor get dispossessed (even in those countries with property rights, many poor are not within the land title system). A 2008 Daewoo deal to lease 1.3 million hectares in Madagascar contributed to the overthrow of the government there.
  • Nutrient deficiencies in the developing world are more severe than energy deficiencies (~15% of population in developing countries are deficient in energy, 31% in Vitamin A, 33% in iodine, 61% in iron). Effects of nutrient deficiencies on intellectual development constitute a poverty trap.
  • Governments everywhere need to invest more in research on productivity-increasing sustainable farming methods, which may or may not include GMOs, to avoid excessive monopolisation of agricultural technology vital to food security. Patent reform may be required.
Posted in Other | 1 Comment

Weather balloon physics

One of the simplest solutions for sending measurement instruments up into the stratosphere is a rubber balloon filled with hydrogen or helium. While the physics of such a balloon would seem to be simple, there are actually some interesting considerations.

Continue reading

Posted in Physics | 10 Comments

New site

Welcome to the newly redesigned zmatt.net. I seem to only get around to upgrading my personal website once every decade so this is a special day indeed. The biggest change is that I now have a blog here (powered by WordPress). I used to blog on LiveJournal but I was seduced by short attention span media like Facebook, I’m trying to restart the blogging habit.

* 30% more Web 2.0 not guaranteed

Posted in Other | Leave a comment