Optics Series Lecture, Lecture – V.
“Matrix formulation in Geometrical Optics.” This lecture was delivered on 3rd February in a lecture session of 1 and 1/2 hours.
In this lecture, we will discuss one of the most interesting and powerful methods in Geometrical Optics. As we have discussed, geometrical optics is that segment of optics in which the wavelength of light is negligible, i.e. the wavelength is insignificant compared to the size of the objects light interacts with. As a consequence light can be considered as rays or geometrical straight lines, and the nuances of light as wave undulations can be postponed to a happy hour.
In any general optical system a ray can be traced through two basic types of traversal: Translation and Refraction. The law of refraction is thus the central tool for ray-tracing. A ray can be described in an optical system by its coordinates, which we will define soon. Our goal is to find the matrix which governs the change in the coordinates of the ray as it travels from one geometric point to another. This will enable us to study simple as well as much more complicated systems in an effective and powerful way, as we will see.
Let’s discuss the basic matrices available for tracing a ray as it travels from one coordinate to another, in two cases. I. Translation Matrix, for simple straight-line motion in a homogeneous medium. II. Refraction Matrix, for refraction at the interface of two different media. In general, therefore, the total traversal of the ray can consist of any number of translations and refractions. A reflection would merely be two translations, and a general refraction might be construed from refraction as well as translations.
A most generalized optical system can therefore have n translation matrices and s refraction matrices, so to say, multiplied in the order in which the ray meets them. This total matrix is called a system matrix.
I. Translation Matrix.
Let us consider a ray that traverses an optical medium in a straight line, at any angle. This is called a translation, as the only change the ray undergoes is a translation or straight-line shifting of coordinates. Needless to say, this means the ray is limited to a single homogeneous medium.
The ray would be completely characterized if we specified the required number of coordinates of the ray, however complicated it is. In our case it is simple: a straight line. Therefore we need to specify the coordinates of any two points to completely specify our ray. At any one geometric point on the ray, its coordinate is specified by two variables: the height from the axis shown here by P’M’, namely x1 (see PP’) or x2 (see MM’), and the angle, given by ψ or α, e.g. α1 or α2. We defined our ray at two points, and hence defined it completely, by specifying two coordinates at two different geometric points: (x1, α1) and (x2, α2).
We define another convenient variable, the optical direction cosine, λ (not to be confused with the wavelength). It is a step closer to defining optical paths, which makes life far easier any time we are dealing with optical situations. That is, the refractive index n comes in handy even if we are restricted in this particular problem to a homogeneous medium. The new variable is given by λ = n cos ψ = n sin α, so that ψ = 90° − α.
In our diagram shown above a ray moves in a straight line in a homogeneous medium, hence α1 = α2. Also x2 = x1 + D tan α1 (eqn i). If we assume the paraxial region, this would imply tan α1 ≈ α1 and sin α1 ≈ α1. This would further transmogrify (pardon me, simply change, they are the same thing) the equations that we just had, explicitly or implicitly:

x2 = x1 + D α1 (eqn ii)
λ1 = n1 α1 (eqn iii)
λ2 = n2 α2 (eqn iv)

Again, since our medium is homogeneous, n1 = n2. Since we are moving in a straight line, α1 = α2, so we have:

λ2 = λ1 and x2 = x1 + (D/n1) λ1 (eqn v)

These equations (eqn v) can be written in terms of matrices, with the ray written as a column holding λ on top and x below:

$$\begin{pmatrix} \lambda_2 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ D/n_1 & 1 \end{pmatrix} \begin{pmatrix} \lambda_1 \\ x_1 \end{pmatrix}$$

Thus a ray at coordinate (x1, α1) goes to coordinate (x2, α2), and this motion is given by a matrix known as the Translation Matrix T, where

$$T = \begin{pmatrix} 1 & 0 \\ D/n_1 & 1 \end{pmatrix}$$

The actual distance traveled by the ray in the homogeneous medium is approximated by D, and this is based on our paraxial assumption. We also see that det (T) = 1.
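To make the translation step concrete, here is a minimal sketch in Python. It assumes the ray is stored as the pair (λ, x) with λ = n α, and that the translation matrix has rows (1, 0) and (D/n, 1) as in eqn (v). The function names and the sample numbers are made up for illustration and are not part of the lecture.

```python
# Paraxial translation of a ray stored as (lam, x), with lam = n * alpha.
# All names and numbers below are illustrative assumptions.

def translation_matrix(D, n):
    """2x2 translation matrix for an axial distance D in a medium of index n."""
    return ((1.0, 0.0),
            (D / n, 1.0))

def apply_matrix(M, ray):
    """Multiply the 2x2 matrix M by the ray column (lam, x)."""
    lam, x = ray
    return (M[0][0] * lam + M[0][1] * x,
            M[1][0] * lam + M[1][1] * x)

def det(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

n = 1.5        # refractive index of the homogeneous medium (assumed)
alpha1 = 0.02  # angle with the optical axis in radians (small, hence paraxial)
x1 = 1.0       # initial height above the axis
D = 10.0       # distance along the axis, in the same length unit as x1

T = translation_matrix(D, n)
lam2, x2 = apply_matrix(T, (n * alpha1, x1))

print("alpha2 =", lam2 / n)  # angle is unchanged by a translation
print("x2     =", x2)        # should agree with x1 + D * alpha1
print("det T  =", det(T))
```

The height grows by D α1 while the angle stays put, which is all that eqn (v) says.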
II. Refraction Matrix.
We determined the form of the matrix that carries a ray from its initial coordinate to its final one when the ray travels in a single homogeneous medium and follows a straight line. A more involved process is that in which the ray gets refracted upon meeting the interface of two media with different refractive indices. By analyzing the traversal of the ray we can determine the matrix that governs refraction. Accordingly we call it a Refraction Matrix.
Let’s draw a suitable diagram for this situation. We will consider a single refracting surface, the one on the left, which is spherical with a radius of curvature R, and leave the rest of the configuration unspecified. That way we can later extend the treatment to configurations with various thicknesses and further surfaces following this one.
Since the ray now traverses two media and gets refracted at their interface, we should invoke Snell’s law in order to constrain the path of the ray. In addition we insist on the paraxial traversal of the ray, that is, the ray keeps as close to the optical axis as possible, in height as well as in angle. Hence the sine of an angle will be approximated by the angle itself, provided the angles are expressed consistently in radians.
Snell’s law is n1 sin θ1 = n2 sin θ2 (eqn i), and under the paraxial assumption it becomes n1 θ1 = n2 θ2 (eqn ii). We also write the relations between the angles as shown in the diagram, so that we can cast everything into our previously defined ray coordinates (λ, x). So we have θ1 = φ1 + α1 and θ2 = φ1 + α2 (eqn iii), where α1 and α2 are the angles made by the incident ray and the refracted ray with the z-axis, that is, the optical axis. Also, φ1 is the angle made by the normal to the refracting spherical surface with the z-axis. n1 and n2 are the refractive indices of the medium prior to the refracting surface and the medium following the refracting surface, respectively.
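As a rough numerical check of how good the paraxial form of Snell’s law is, the sketch below compares the exact refraction angle from n1 sin θ1 = n2 sin θ2 with the paraxial value from n1 θ1 = n2 θ2. The indices (air to glass) and the sample angles are assumptions chosen only for illustration.

```python
import math

# Assumed refractive indices, e.g. air to glass (illustrative values).
n1, n2 = 1.0, 1.5

def exact_angle(theta1):
    """Refraction angle from the full Snell's law, in radians."""
    return math.asin(n1 * math.sin(theta1) / n2)

def paraxial_angle(theta1):
    """Refraction angle from the paraxial form n1*theta1 = n2*theta2."""
    return n1 * theta1 / n2

def relative_error(theta1):
    """Relative error of the paraxial angle with respect to the exact one."""
    exact = exact_angle(theta1)
    return abs(paraxial_angle(theta1) - exact) / exact

for degrees in (1, 5, 10, 20):
    theta1 = math.radians(degrees)
    print(f"{degrees:2d} deg: exact = {exact_angle(theta1):.6f} rad, "
          f"paraxial = {paraxial_angle(theta1):.6f} rad, "
          f"relative error = {relative_error(theta1):.2e}")
```

The error stays tiny for rays within a few degrees of the axis and grows steadily beyond that, which is why we keep insisting on small angles.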
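Our paraxial assumption means φ1 = x/R (eqn iv), where x is the height at which the ray meets the surface. Thus we have from eqn (ii) and eqn (iii), n1(φ1+α1) = n2(φ1+α2), and using eqn (iv) we have:

n2 α2 = n1 α1 − (n2 − n1) x/R

If we define P = (n2 − n1)/R, then with λ1 = n1 α1, λ2 = n2 α2 and x = x1 we would have

λ2 = λ1 − P x1 (eqn v)

P is known as the power of the refracting surface. Now we note that the height of the ray at point P before and after refraction is the same, so:

x2 = x1 (eqn vi)

Equations (v) and (vi) can be written together in matrix form. The matrix that appears is the Refraction Matrix, written here in script to distinguish it from the radius R:

$$\begin{pmatrix} \lambda_2 \\ x_2 \end{pmatrix} = \begin{pmatrix} 1 & -P \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \lambda_1 \\ x_1 \end{pmatrix}, \qquad \mathcal{R} = \begin{pmatrix} 1 & -P \\ 0 & 1 \end{pmatrix}, \qquad \det(\mathcal{R}) = 1$$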
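The refraction step can be sketched the same way. The code below assumes the ray is stored as (λ, x), that the power is P = (n2 − n1)/R, and that R is taken positive when the surface is convex towards the incoming light. These conventions, the function names and the numbers are assumptions for illustration. As a sanity check, a ray coming in parallel to the axis should cross the axis at a distance n2 R/(n2 − n1) behind the surface.

```python
# Paraxial refraction at a single spherical surface, ray stored as (lam, x).
# Sign convention assumed here: R > 0 for a surface convex towards the incoming light.

def refraction_matrix(n1, n2, R):
    """2x2 refraction matrix for a surface of radius R between indices n1 and n2."""
    P = (n2 - n1) / R  # power of the refracting surface
    return ((1.0, -P),
            (0.0, 1.0))

def apply_matrix(M, ray):
    """Multiply the 2x2 matrix M by the ray column (lam, x)."""
    lam, x = ray
    return (M[0][0] * lam + M[0][1] * x,
            M[1][0] * lam + M[1][1] * x)

n1, n2 = 1.0, 1.5  # assumed indices before and after the surface
R = 10.0           # assumed radius of curvature
x1 = 1.0           # height at which the ray meets the surface
alpha1 = 0.0       # incoming ray parallel to the optical axis

lam2, x2 = apply_matrix(refraction_matrix(n1, n2, R), (n1 * alpha1, x1))
alpha2 = lam2 / n2  # angle of the refracted ray with the axis

# A ray at height x2 heading down at angle alpha2 reaches the axis after -x2/alpha2.
focus_distance = -x2 / alpha2

print("x2             =", x2)  # height is unchanged by refraction
print("alpha2         =", alpha2)
print("focus distance =", focus_distance)
print("n2*R/(n2 - n1) =", n2 * R / (n2 - n1))
```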
III. System Matrix.
In general any optical system is made of a series of refractions and translations of a ray. We can thus write a general matrix which transforms the coordinates from one point of the configuration to another under our paraxial assumption:

$$\begin{pmatrix} \lambda_2 \\ x_2 \end{pmatrix} = S \begin{pmatrix} \lambda_1 \\ x_1 \end{pmatrix}, \qquad S = \begin{pmatrix} b & -a \\ -d & c \end{pmatrix}$$

where S is called the System Matrix. Accordingly, a system matrix is a product of any number of possible refraction and translation matrices, with the matrix the ray meets first standing rightmost. It’s easy to see that det (S) = 1 as well, since det (R) = 1 and det (T) = 1 and, for example, S = RT, where R here denotes the refraction matrix and not the radius. This implies that the elements of the system matrix in general satisfy the condition bc – ad = 1. b and c are dimensionless, a has the dimension of inverse length and d has the dimension of length. The matrix in this form is also called the ABCD matrix, owing to the symbols of its elements: a, b, c, d.
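As an illustration of how a system matrix is assembled, here is a small sketch that multiplies two refractions and one translation for a hypothetical thick lens in air. It assumes rays stored as (λ, x), translation matrices with rows (1, 0) and (D/n, 1), refraction matrices with rows (1, −P) and (0, 1), and the element layout S = ((b, −a), (−d, c)). The lens data are made up.

```python
# Building a system matrix as a product of refraction and translation matrices.
# Conventions and lens data are assumptions for illustration only.

def matmul(A, B):
    """Product A.B of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def translation(D, n):
    """Translation by axial distance D in a medium of index n."""
    return ((1.0, 0.0), (D / n, 1.0))

def refraction(n1, n2, R):
    """Refraction at a spherical surface of radius R from index n1 into n2."""
    return ((1.0, -(n2 - n1) / R), (0.0, 1.0))

def det(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Hypothetical biconvex lens in air.
n_glass, t = 1.5, 2.0    # index and axial thickness
R1, R2 = 20.0, -20.0     # radii of the first and second surface

first_surface = refraction(1.0, n_glass, R1)
inside_glass = translation(t, n_glass)
second_surface = refraction(n_glass, 1.0, R2)

# The matrix the ray meets first stands rightmost in the product.
S = matmul(second_surface, matmul(inside_glass, first_surface))

# Read off the elements in the layout S = ((b, -a), (-d, c)).
b, a = S[0][0], -S[0][1]
d, c = -S[1][0], S[1][1]

print("S         =", S)
print("bc - ad   =", b * c - a * d)  # should be 1 up to rounding
print("a (power) =", a)
```

With these conventions a works out to P1 + P2 − P1 P2 t/n, which reduces to the sum of the two surface powers when the thickness t goes to zero.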