100 Gbps IM/DD links using quad-polarization: Performance, complexity, and power dissipation

Saldaña Cercos, Silvia; Piels, Molly; Estaran Tolosa, Jose Manuel; Usuga Castaneda, Mario A.; Porto da Silva, Edson; Fagertun, Anna Manolova; Tafur Monroy, Idelfonso

Published in:
Optics Express

Link to article, DOI:
10.1364/OE.23.019954

Publication date:
2015

Document Version
Publisher's PDF, also known as Version of record


S. Saldaña Cercós,* M. Piels, J. Estarán, M. Usuga, E. Porto da Silva, A. Manolova Fagertun, and I. Tafur Monroy

Department of Photonics Engineering, Technical University of Denmark (DTU), Kgs. Lyngby, Denmark

*ssce@dtu.fotonik.dk

Abstract: A computational complexity, power consumption, and receiver sensitivity analysis is presented for three short-range direct detection link scenarios: 1) quad-polarization, 2) wavelength division multiplexing (WDM), and 3) parallel optics. Results show that the power consumption penalty associated with the quad-polarization digital signal processing (DSP) is negligibly small. However, the required analog-to-digital converters account for 47.6% of the total system power consumption. Transmission of 4x32 Gbps over 2 km of standard single mode fiber is achieved with a receiver sensitivity of 4.4 dBm.

© 2015 Optical Society of America

OCIS codes: (060.0060) Fiber optics and optical communications; (060.2360) Fiber optics links and subsystem; (060.4260) Multiplexing.


1. Introduction

With the dot-com bubble in the late '90s came the boom of data centers [1]. However, with current traffic growth and the increasing number of services powered by data centers, optical data links are suffering from bandwidth limitations. High-capacity short-range links are therefore of research interest both in academia and in industry.

Enabling technologies to cope with high bitrate requirements include wavelength division multiplexing (WDM) and parallel optics; however, current traffic demands challenge the net bitrate achievable by WDM architectures and tighten the constraints introduced by the cabling overhead of parallel optics. Thus, alternative solutions such as higher-order modulation formats and space division multiplexing have been proposed. Higher spectral efficiency for data links has been explored employing four-level pulse amplitude modulation (4-PAM) [2], duo-binary modulation [3], and even more complex modulation formats such as discrete multitone (DMT) modulation [4] or carrierless amplitude phase (CAP) modulation [5].

This spectral efficiency increase comes at the expense of computational complexity, and thus of cost and power consumption [3]. Consequently, polarization multiplexing solutions have been proposed as an enabling technology for the deployment of high-speed intensity modulation direct detection (IM/DD) transmission systems. In [6], self-coherent direct detection with a spectrally separated frequency reference is used for polarization-multiplexed transmission of an orthogonal frequency division multiplexed (OFDM) signal. The architecture also uses different frequency-orthogonal bands for each polarization to suppress crosstalk. Further improvements in terms of complexity are achieved by incoherent direct detection links [7]. Incoherent direct detection links are achieved by employing a Stokes receiver as reported in [8] and [9]. Based on the Stokes receiver structure from [7], a technique that allows four independent data streams to be transmitted on four different states of polarization (SOPs), quad-polarization, is presented in [10].

The main goal of this work is to present a fair comparison between quad-polarization and the two competing technologies: parallel optics and WDM. In this paper, an analysis of the quad-polarization technique in terms of bit-error-rate (BER) performance, the digital complexity introduced by the digital signal processing (DSP) at the receiver, and power dissipation is reported, positioning it as a potential alternative to WDM and parallel optics. The results reported are based on 4-polarization multiplexing, 4-channel WDM, and 4 parallel optical links at a 32 Gbps transmission rate to achieve a 100 Gbps net bitrate, accounting for 28% forward error correction (FEC) overhead. Results are evaluated at the standard FEC limits (i.e., 7% and 20%), but since transmission was achieved up to 32 Gbps per link, this allows for a higher FEC threshold and better sensitivity.
The rest of the paper is structured as follows: Section 2 provides a description of the quad-polarization concept, the requirements at the transmitter and receiver, receiver design, and required DSP. Section 3 determines the computational complexity of each of the blocks needed at the receiver. Power consumption results are reported in Section 4. Performance analysis in terms of bit error rate based on experimental results is presented in Section 5. Concluding remarks and discussion are summarized in Section 6.

2. Quaternary polarization multiplexing digital signal processing

This section presents a detailed description of: 1) the quad-polarization concept and the system testbed used, 2) the requirements at the transmitter that allow a quad-polarization system to operate without ambiguities at the receiver, 3) the receiver used and the motivation for using it, and 4) the DSP stages required to demodulate the data streams carried by the four states of polarization (SOPs).

The general idea of quad-polarization multiplexed systems consists of transmitting four data streams simultaneously over a single medium (standard single mode fiber (SSMF)) using four different SOPs instead of only the two conventional linear and orthogonal polarizations (X and Y). In previously demonstrated IM/DD systems a maximum of three linear SOPs were simultaneously transmitted and received without ambiguity [11], since only three independent polarization parameters were read given that phase was not recovered: the amplitudes of the projections on the two perpendicular polarization planes, and the rotation angle. However, in [10] the optical phase is indirectly exploited to encode another independent stream in a circular SOP, thus increasing the multiplexing order by one.

Figure 1 presents the block diagram of the testbed for the quad-polarization system. At the transmitter, four distributed feedback lasers (DFBs) are used as light sources. The center frequencies of the DFBs are spaced by 100 GHz (from DFB A = 193.2 THz through DFB D = 193.5 THz) to ensure incoherent power addition. With a monolithically integrated quad-polarization modulator, a single transmit laser could be used, as the phase relationship between the four branches could easily be stabilized. In this case, the quad-polarization approach would increase system spectral efficiency. The DFBs feed four integrated Mach-Zehnder modulators (MZMs). The MZMs are driven by four independent 32 Gbd non-return-to-zero (NRZ) electrical signals with 3 Vpp. Each signal is derived from pseudorandom bit sequences (PRBS) of length 2^15 – 1, decorrelated with electrical delay lines. Four polarization controllers (PCs) are used at the output of the MZMs to obtain the four desired SOPs (i.e., X, Y, 45°, and left-circular (LC)). Polarization multiplexing is performed using a standard 4:1 optical coupler. Before fiber transmission, a power equalization stage is required. The power equalization stage compensates for the varying MZM insertion losses at the transmitter. This is achieved using an erbium-doped fiber amplifier (EDFA) and a variable optical attenuator (VOA) (no net optical gain is provided through the power equalization stage), though it could also be achieved by manually adjusting the output powers of the lasers. Then the signal is launched into the 2 km SSMF for transmission.

Fig. 1. Schematic of the experimental setup for a 4-SOP 128 Gbps transmission over 2 km standard single mode fiber. DFB, distributed feedback laser; DSP, digital signal processing; PD, photodiode; PPG, pulse pattern generator; WDM, wavelength division multiplexing.

In order to be able to demultiplex the four SOPs after transmission two key steps are taken: 1) compensation for the rotation of the SOPs when transmitted over SSMF, 2) demodulation in the digital domain. The Stokes analyzer and low-complexity DSP in the receivers are used to allow the transmission of 4-SOP optical signals.

Compensation of the Poincaré sphere rotation is achieved by using a polarization tracking algorithm and Stokes analyzer together with DSP at the receiver [7]. For quad-polarization, the work presented in [7] is extended by tracking three Stokes vectors instead of two. Thus, the rotation matrix of the fiber can be extracted, enabling signal demodulation of four independent SOPs. Details on the tracking algorithm are presented in Section 3. The polarization scrambler depicted in Fig. 1 at the end of the link is used for survey purposes. At this stage the polarization tracking algorithm performance is evaluated.

The Stokes analyzer used at the receiver is composed of four branches with four independent photodetectors, which measure S0, S1, S2, and S3. Here S0 is the instantaneous total power, and S1, S2, and S3 are defined by Eqs. (1)–(3), respectively.

\[
S_1 = 2I_{X} - S_0 \tag{1}
\]

\[
S_2 = 2I_{45^\circ} - S_0 \tag{2}
\]

\[
S_3 = 2I_{LC} - S_0 \tag{3}
\]

At the first branch of the Stokes analyzer S0 is determined. An eye diagram of the quad-polarization signal received by the S0 photodiode is shown in Fig. 2(a). The remaining three branches use polarization controllers and polarizers to align to 1) X or Y, defined as \(I_{X|Y}\), to measure S1, 2) 45° or 135°, defined as \(I_{45^\circ|135^\circ}\), to measure S2, and 3) right-circular or left-circular, defined as \(I_{RC|LC}\), to measure S3. Figures 2(b), (c), and (d) show the eye diagrams of the signals received by the S1, S2, and S3 photodiodes, respectively. The signals received by these detectors change as a function of the received state of polarization but, like the S0 eye, have multiple levels. After photo-detection the signal is sampled by an 80 GSa/s real-time digital sampling oscilloscope (DSO) with 25 GHz bandwidth, and processed offline using DSP. The net insertion loss (IL) of this approach is between 1.2 and 3 dB for the system presented in this work, depending on the fiber rotation. However, a theoretically lossless system is possible if an additional photodetector is used [9, 12].
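The mapping from the four photodiode readings to the Stokes parameters can be sketched in a few lines of Python. This is only an illustration: it assumes the S1 branch is aligned to X, the S2 branch to 45°, and the S3 branch to left-circular, and the function name `stokes_from_intensities` is hypothetical. The sign of S3 depends on the handedness convention and is taken here as written in Eq. (3).

```python
def stokes_from_intensities(s0, i_x, i_45, i_lc):
    """Return (S0, S1, S2, S3) from the four Stokes-analyzer branches.

    s0   : total instantaneous power (first branch)
    i_x  : power behind the X-aligned polarizer (assumed alignment)
    i_45 : power behind the 45-degree polarizer (assumed alignment)
    i_lc : power behind the left-circular analyzer (assumed alignment)
    """
    s1 = 2.0 * i_x - s0    # one multiplication, one addition
    s2 = 2.0 * i_45 - s0   # one multiplication, one addition
    s3 = 2.0 * i_lc - s0   # one multiplication, one addition
    return (s0, s1, s2, s3)


# Unit-power X-polarized light: all power passes the X polarizer,
# half passes the 45-degree and circular analyzers.
example = stokes_from_intensities(1.0, 1.0, 0.5, 0.5)
print(example)
```

The three multiply-and-subtract operations per sample correspond to one operation pair per polarizing branch.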

Fig. 2. Eye diagrams for the quad-polarization signal received by (a) the S0 photodiode, (b) the S1 photodiode, (c) the S2 photodiode, and (d) the S3 photodiode.

The second step consists of demodulation of the signal in the digital domain. Figure 3 illustrates the three different stages required in the DSP at the receiver to successfully demodulate the data from the 4-SOP IM/DD system. In stage 1, general front-end corrections are performed. In stage 2 SOP tracking occurs, and finally in stage 3 the signal is demodulated.

Front-end correction includes error estimation (i.e., timing error correction) and resampling to the minimum integer number of samples per symbol. The resampling process is done in three steps: 1) interpolation, 2) joint matched and anti-aliasing filtering, and 3) decimation. Note that for parallel optics and WDM systems a bit error rate tester (BERT) can be used at the receiver side, eliminating the need for ADCs. If front-end corrections are instead performed in the digital domain, only one bit of ADC resolution is required, reducing the ADC power consumption by a factor of four compared to the quad-polarization system.

After front-end corrections are performed, the intensities are transformed to the Stokes parameters. In the Stokes space, propagation along the fiber can be described as a rotation. Tracking the received Stokes vectors amounts to tracking this rotation matrix.

Finally, at the third stage of the DSP, Stokes to intensity transformation and de-mapping are performed. Demodulation consists of a 4x4 multiple-input-multiple-output (MIMO) process. The main benefit in terms of computational load is that the mapping from the transmitter SOPs to the Stokes space and its inverse are known in advance as presented in Eq. (4)

\[
\begin{bmatrix}
I_x \\
I_y \\
I_{45} \\
I_{RC}
\end{bmatrix} = M_{demux} \cdot \begin{bmatrix}
S_0 \\
S_1 \\
S_2 \\
S_3
\end{bmatrix} \tag{4}
\]

where \( I_x, I_y, I_{45}, \) and \( I_{RC} \) are the transmitted intensities, and \( M_{demux} \) is the 4x4 demultiplexing matrix from Eq. (5).

Thus, the combined demapping and derotation process is described in Eq. (6)

\[
\begin{bmatrix}
I_x \\
I_y \\
I_{45} \\
I_{RC}
\end{bmatrix} = M_{demux} \cdot M_{rot}^T \cdot \begin{bmatrix}
S_0 \\
S_1 \\
S_2 \\
S_3
\end{bmatrix} \tag{6}
\]

where \( M_{rot} \) can be obtained from tracking three transmitted Stokes vectors according to Eq. (7).

\[
M_{rot} = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & S_{11} & S_{12} & S_{13} \\
0 & S_{21} & S_{22} & S_{23} \\
0 & S_{31} & S_{32} & S_{33}
\end{bmatrix} \tag{7}
\]

Note that \( M_{rot}^{-1} = M_{rot}^T \) because \( M_{rot} \) is an orthogonal matrix. Consequently, no matrices need to be inverted in real time, reducing the digital processor requirements in terms of speed and power consumption.
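As an illustration of the combined derotation and demapping step, the following Python sketch multiplexes four intensities into a Stokes vector, applies a fiber rotation, and recovers the intensities using only a transpose. The demultiplexing matrix is an assumption: it is derived for ideal, fully polarized tributaries that add incoherently, with unit Stokes vectors (1, 1, 0, 0), (1, -1, 0, 0), (1, 0, 1, 0), and (1, 0, 0, 1) for X, Y, 45°, and right-circular. It is an illustrative stand-in and not necessarily the matrix the authors use. All function names are hypothetical.

```python
import math

# Assumed ideal Stokes vectors (S0, S1, S2, S3) of the unit-power tributaries,
# in the order X, Y, 45 degrees, right-circular.
SOP_STOKES = [(1, 1, 0, 0), (1, -1, 0, 0), (1, 0, 1, 0), (1, 0, 0, 1)]

# Closed-form inverse of the matrix whose columns are the vectors above.
# It is fixed and known in advance, so nothing is inverted at run time.
M_DEMUX = [
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5, -0.5],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def multiplex(intensities):
    """Incoherent power addition of the four tributaries in Stokes space."""
    return [sum(i * sop[k] for i, sop in zip(intensities, SOP_STOKES))
            for k in range(4)]


def rotation_about_s1(theta):
    """Example fiber rotation: rotate the (S2, S3) plane by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, c, -s], [0, 0, s, c]]


def demodulate(stokes_rx, m_rot):
    """Derotate with the transpose of m_rot, then demap to intensities."""
    return matvec(M_DEMUX, matvec(transpose(m_rot), stokes_rx))


tx = [1.0, 0.0, 1.0, 1.0]                 # intensities on X, Y, 45, RC
m_rot = rotation_about_s1(0.7)
rx = matvec(m_rot, multiplex(tx))         # what the Stokes analyzer measures
print([round(i, 6) for i in demodulate(rx, m_rot)])
```

Because the rotation is orthogonal, the transpose undoes it exactly, so the recovered intensities equal the transmitted ones up to floating-point rounding.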

3. Computational complexity analysis

In this section a computational complexity analysis of the DSP required for the quad-polarization system is detailed. The computational complexity of each of the blocks described in Section 2 is measured by breaking each block down into the number of operations (i.e., number of real multiplications and real additions) it requires.

For the general front-end correction, timing error detection is done using the Gardner algorithm. The error for the Gardner algorithm is computed using Eq. (8) [13], which accounts for one real addition and one real multiplication per symbol.

$$e_n = (y_n - y_{n-2}) \cdot y_{n-1} \tag{8}$$
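The Gardner error term can be sketched as follows. The sketch assumes two samples per symbol, so that the middle sample falls halfway between two symbol-spaced samples; the function name `gardner_errors` is hypothetical, and loop filtering and timing adjustment are left out.

```python
def gardner_errors(samples):
    """Return the Gardner timing error for every symbol.

    Assumes two samples per symbol: samples[n] and samples[n - 2] are
    symbol-spaced, and samples[n - 1] is the midpoint between them.
    """
    errors = []
    for n in range(2, len(samples), 2):
        # One subtraction and one multiplication per symbol.
        errors.append((samples[n] - samples[n - 2]) * samples[n - 1])
    return errors


# Perfect timing: the midpoint of every transition is zero, so the error is zero.
print(gardner_errors([1.0, 0.0, -1.0, 0.0, 1.0]))
# Offset timing: the midpoint samples are nonzero and so is the error.
print(gardner_errors([1.0, 0.5, -1.0, -0.5, 1.0]))
```

A nonzero error of consistent sign indicates that the sampling instant should be moved in one direction; a real receiver would feed this value into a loop filter.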

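Resampling is performed in three different steps. First, interpolation, then a low pass filter is applied, and finally, decimation is conducted. After decimation a matched filter is used. Table 1 summarizes the number of operations required for the timing error detector based on the work presented in [14], and for each of the three resampling steps.

Table 1. Number of Real Additions and Real Multiplications for Stage 1: Front-End Correction

<table>
<thead>
<tr>
<th>Action</th>
<th>Operation</th>
<th>Num. of operations/s</th>
</tr>
</thead>
<tbody>
<tr>
<td>Timing error detection</td>
<td>mult</td>
<td>$\text{Baud} \cdot \text{Nspsym}$</td>
</tr>
<tr>
<td></td>
<td>add</td>
<td>$\text{Baud} \cdot \text{Nspsym}$</td>
</tr>
<tr>
<td>Interpolation</td>
<td>mult</td>
<td>$\text{Baud} \cdot \text{Nspsym}$</td>
</tr>
<tr>
<td></td>
<td>add</td>
<td>$2 \cdot \text{Baud} \cdot \text{Nspsym}$</td>
</tr>
<tr>
<td>Low-pass filter</td>
<td>mult</td>
<td>$N_1 \cdot \text{Baud} \cdot \text{Nspsym}$</td>
</tr>
<tr>
<td></td>
<td>add</td>
<td>$N_1 \cdot \text{Baud} \cdot \text{Nspsym}$</td>
</tr>
<tr>
<td>Decimation</td>
<td>mult</td>
<td>$\frac{(D-1)}{D} \cdot \text{Baud} \cdot \text{Nspsym}$</td>
</tr>
<tr>
<td></td>
<td>add</td>
<td>$\frac{1}{D} \cdot \text{Baud} \cdot \text{Nspsym}$</td>
</tr>
</tbody>
</table>

Baud, baud rate; Nspsym, number of samples per symbol; $N_1$, number of anti-aliasing filter taps; $D$, decimation factor.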

Based on the work presented in [15], one-multiply form linear interpolation is assumed for resampling. One-multiply form linear interpolation has a computational complexity of one real multiplication and two real additions per output sample. The number of output samples per second is the baud rate times the number of samples per symbol. After interpolation, an anti-aliasing low-pass filter is used to suppress the "ghost frequencies". Its computational complexity depends on the number of filter taps ($N_1$), which corresponds to the filter length (i.e., order) minus one filter tap. One real multiplication and one real addition are used to implement a real filter tap. Notice that no complex taps are used for filtering, thus reducing the computational complexity by at least a factor of three. Decimation is the last resampling step. Its computational complexity depends on the decimation factor $D$, which determines how many data points are eliminated. A decimation factor of three indicates that two out of every three samples are eliminated. For the work presented here, one real multiplication per eliminated data point (i.e., $\frac{(D-1)}{D}$) and one addition per stored sample (i.e., $\frac{1}{D}$) are accounted for. The decimation factor used for the quad-polarization scenario was the sampling rate. Analogous to the anti-aliasing filter, the computational complexity of the matched filter is proportional to its number of real filter taps ($N_2$).
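The one-multiply form of linear interpolation mentioned above can be illustrated with a short Python sketch. The function name `resample_linear` and the `step` parameter (spacing of the output samples in units of input samples) are hypothetical, and the bookkeeping that computes the fractional position is not part of the per-sample operation count discussed in the text.

```python
def resample_linear(samples, step):
    """Resample a sequence by linear interpolation in one-multiply form.

    Each output is a + mu * (b - a): one multiplication and two
    additions (one subtraction, one addition) per output sample.
    """
    out = []
    k = 0
    while True:
        t = k * step            # fractional position of output sample k
        i = int(t)              # index of the left neighbour
        if i >= len(samples) - 1:
            break               # no right neighbour left to interpolate with
        mu = t - i              # fractional offset in [0, 1)
        a, b = samples[i], samples[i + 1]
        out.append(a + mu * (b - a))
        k += 1
    return out


# Doubling the sample rate of a ramp inserts the midpoints.
print(resample_linear([0.0, 1.0, 2.0, 3.0], 0.5))
```

The two-multiply form, (1 - mu) * a + mu * b, gives the same result but costs an extra multiplication per output sample, which is why the one-multiply form is the cheaper choice for a complexity estimate.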

After the general front-end correction, SOP tracking is performed. Fig. 4 illustrates the required sub-blocks for the SOP tracking algorithm. The DSP described here, as well as the tracking algorithm, are based on the work presented in [7]. The algorithm described in [7] is implemented for two SOPs (the conventional orthogonal X and Y polarizations). This same algorithm is extended for quad-polarization by carefully defining the transmitted Jones vectors, as well as the receiver design and DSP described in Section 2.

The first step in stage 2 for the polarization tracking is to generate the Stokes vectors. For that, Eqs. (1)–(3) are used. $S_0$ from the Stokes vectors indicates the instantaneous total intensity. As presented in Table 2, generating the Stokes vectors requires three multiplications and three additions (one for each branch of the Stokes analyzer) per sample. Note that the number of samples per symbol is reduced to one after decimation in the front-end corrections.

Once the signal has been transformed to the Stokes space, an intensity discriminator is required. It discriminates zero-power symbols in order to ensure that no division by zero occurs during the normalization process. In order to define the zero-power symbols, a threshold $S_{th}$ is established so that all $S_0(n)$ symbols with intensity below the threshold level are de-mapped and removed from the four Stokes sequences. This step requires one addition per symbol. Then, normalization of the Stokes vectors by $S_0(n)$ is performed. This operation maximizes the power of the polarized components. This process requires one multiplication per symbol.

Table 2. Number of Real Additions and Real Multiplications for Stage 2: SOP Tracking

<table>
<thead>
<tr>
<th>Action</th>
<th>Operation</th>
<th>Num. of operations/s</th>
<th>Mathematical expr.</th>
</tr>
</thead>
<tbody>
<tr>
<td>Generate Stokes vectors</td>
<td>mult</td>
<td>3 · $\text{Baud}$</td>
<td></td>
</tr>
<tr>
<td></td>
<td>add</td>
<td>3 · $\text{Baud}$</td>
<td></td>
</tr>
<tr>
<td>Intensity discriminator</td>
<td>mult</td>
<td>0</td>
<td>$S_0(n) - S_{th} &lt; 0$</td>
</tr>
<tr>
<td></td>
<td>add</td>
<td>$\text{Baud}$</td>
<td></td>
</tr>
<tr>
<td>Normalization</td>
<td>mult</td>
<td>$\text{Baud}$</td>
<td></td>
</tr>
<tr>
<td></td>
<td>add</td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>Amplitude discriminator</td>
<td>mult</td>
<td>2 · $\text{Baud}$</td>
<td>$u(n) - u_{th} &lt; 0$</td>
</tr>
<tr>
<td></td>
<td>add</td>
<td>2 · $\text{Baud}$</td>
<td></td>
</tr>
<tr>
<td>Reference updater</td>
<td>mult</td>
<td>3 · $\text{Baud} \cdot \delta$</td>
<td></td>
</tr>
<tr>
<td></td>
<td>add</td>
<td>4 · $\text{Baud} \cdot \delta$</td>
<td></td>
</tr>
<tr>
<td>Threshold updater</td>
<td>mult</td>
<td>2 · $\text{Baud} \cdot L_{th}$</td>
<td></td>
</tr>
<tr>
<td></td>
<td>add</td>
<td>3 · $\text{Baud} \cdot L_{th}$</td>
<td></td>
</tr>
</tbody>
</table>

Baud, number of samples after decimation (i.e., baud rate); $L_{th}$, number of threshold updates; $\delta$, number of reference updates.

Next, an iterative process is performed in which amplitude discrimination and a reference update take place. This process tracks the Stokes vectors associated with two or three transmitted signals. The tracking algorithm is based on the work presented in [7] and is performed in four steps: 1) define two or three reference unit Stokes vectors $v(n)$; 2) determine the projection of the normalized Stokes vector along the direction of the reference Stokes vector as presented in Eq. (9),

$$u(n) = \left[ \frac{S(n)}{S_0(n)} \right] \cdot v(n) \tag{9}$$

3) define two or three thresholds ($u_{th_1}$, $u_{th_2}$, and optionally $u_{th_3}$), which are used to determine which tributary the measured sample belongs to (i.e., which SOP is used at the transmitter side): if $u(n) \geq u_{th_1}$ then the measured sample belongs to the first tributary, if $u(n) \geq u_{th_2}$ then it belongs to the second tributary, etc.; and 4) update the reference Stokes vector based on which tributary the measured sample belongs to. Equation (10) gives the reference Stokes vector update when the measured sample belongs to tributary $i$, where $\mu$ is the step-size parameter.

$$v_i(n+1) = \frac{v_i(n) + \mu \left( \frac{S(n)}{S_0(n)} - v_i(n) \right)}{\left| v_i(n) + \mu \left( \frac{S(n)}{S_0(n)} - v_i(n) \right) \right|} \tag{10}$$

The case where $u(n) < u_{th_1}$ and $u(n) < u_{th_2}$ indicates that the transmitted SOP is not one of the SOPs being tracked and thus $v(n+1)$ is not updated. The number of operations needed for the reference updater depends on how often the Stokes vector is updated ($\delta$ in Table 2). Based on Eq. (10), the number of real additions and real multiplications per update is 4 and 3, respectively.
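A single iteration of the amplitude discrimination and reference update for one tracked tributary might look like the Python sketch below. It follows the projection and normalized update described above, but it is only an illustration: the function names, the three-component tuple representation of the Stokes vector (S1, S2, S3), and the numerical values of the threshold and step size are assumptions, and threshold adaptation is omitted.

```python
import math


def dot(a, b):
    """Inner product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return tuple(x / norm for x in v)


def update_reference(v, s, s0, u_th, mu):
    """One tracking iteration for a single tributary.

    v    : current reference unit Stokes vector
    s    : measured (S1, S2, S3) for this symbol
    s0   : measured total power S0 for this symbol (assumed above S_th)
    u_th : amplitude threshold for this tributary
    mu   : step-size parameter
    Returns (new_reference, updated_flag).
    """
    s_norm = tuple(x / s0 for x in s)   # normalized Stokes vector
    u = dot(s_norm, v)                  # projection on the reference
    if u < u_th:
        return v, False                 # sample belongs to another SOP
    stepped = tuple(vi + mu * (si - vi) for vi, si in zip(v, s_norm))
    return normalize(stepped), True     # move towards the sample, renormalize


# The reference starts on the S1 axis and is pulled towards a rotated SOP.
target = (math.cos(0.2), math.sin(0.2), 0.0)
v = (1.0, 0.0, 0.0)
for _ in range(200):
    v, _updated = update_reference(v, target, 1.0, 0.5, 0.1)
print(round(dot(v, target), 9))
```

With a small step size the reference converges smoothly to the received SOP, while samples whose projection falls below the threshold leave it untouched.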

If the three tracked vectors are the signals transmitted in the X, 45°, and right-circular polarizations, the rotation matrix of the fiber can be determined from the three tracked vectors according to Eq. (7). The third vector can either be tracked using the same algorithm as $v_1$ and $v_2$, or obtained from their cross product (i.e., $v_3 = v_1 \times v_2$). The performance of both options is similar. The quad-polarization constellation consists of 16 distinct symbols. For each tracked vector, only one of these is used. Assuming that the data is independently and identically distributed, this corresponds to an update parameter $\delta$ of 2/16 for tracking two vectors and 3/16 for tracking three vectors. Thus, of the two possible flows, if \( v_3 \) is obtained from the cross product, the parameter \( \delta \) is \( 2/16 \), and 6 additional multiplications and 3 additions are required for the cross product. If the three vectors \( v_1, v_2, \) and \( v_3 \) are all tracked, the parameter \( \delta \) is \( 3/16 \). For the calculations presented in Section 4 the second flow is used (i.e., tracking three vectors). The reader is referred to [7] for further details on the mathematical formulation of the tracking algorithm.
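The cross-product option and the assembly of the rotation matrix can be sketched as below. The sketch assumes that the tracked vectors are the received images of the S1, S2, and S3 axes and that they form the columns of the lower-right 3x3 block of the rotation matrix; the source does not state the row or column arrangement explicitly, so this layout and the function names are assumptions.

```python
def cross(a, b):
    """3-D cross product: 6 multiplications and 3 additions/subtractions."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotation_matrix(v1, v2, v3=None):
    """Build the 4x4 Stokes-space rotation from the tracked vectors.

    If v3 is not tracked, it is derived as the cross product of v1 and v2.
    The tracked vectors are placed as columns (assumed layout).
    """
    if v3 is None:
        v3 = cross(v1, v2)
    cols = (v1, v2, v3)
    m = [[1.0, 0.0, 0.0, 0.0]]          # S0 is unaffected by the rotation
    for r in range(3):
        m.append([0.0] + [cols[c][r] for c in range(3)])
    return m


# A 90-degree rotation about the S3 axis maps S1 to S2 and S2 to -S1.
for row in rotation_matrix((0, 1, 0), (-1, 0, 0)):
    print(row)
```

Deriving the third vector saves one tracked reference at the cost of the nine extra operations per update counted in the text.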

Additionally, the two (three) thresholds, \( u_{th_1} \) and \( u_{th_2} \) (or \( u_{th_3} \)) need to be updated occasionally to assure convergence. Threshold update requires three additions (to find which signal is the strongest among the ones received by S1, S2, and S3), and two multiplications (one to normalize, and one to multiply by a tuning factor). Threshold update occurs \( 1/100 \) of the time as presented in Table 2. This value has been chosen empirically.
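To make the bookkeeping of stage 2 concrete, the per-symbol operation counts can be added up with a few lines of Python. This is a sketch of the arithmetic only, using the per-block counts stated in this section (Stokes generation, intensity discriminator, normalization, amplitude discriminator, reference updater, threshold updater); the function name is hypothetical, and multiplying by the baud rate gives operations per second.

```python
from fractions import Fraction


def sop_tracking_ops_per_symbol(delta, l_th):
    """Return (multiplications, additions) per symbol for SOP tracking.

    delta : fraction of symbols that trigger a reference update
    l_th  : fraction of symbols that trigger a threshold update
    """
    mults = (3              # generate Stokes vectors
             + 0            # intensity discriminator
             + 1            # normalization
             + 2            # amplitude discriminator
             + 3 * delta    # reference updater
             + 2 * l_th)    # threshold updater
    adds = (3               # generate Stokes vectors
            + 1             # intensity discriminator
            + 0             # normalization
            + 2             # amplitude discriminator
            + 4 * delta     # reference updater
            + 3 * l_th)     # threshold updater
    return mults, adds


# Three tracked vectors (delta = 3/16) and a threshold update every 100 symbols.
mults, adds = sop_tracking_ops_per_symbol(Fraction(3, 16), Fraction(1, 100))
print(float(mults), float(adds))
```

Exact fractions are used so that the small update rates do not introduce rounding noise into the totals.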

Finally, stage 3 deals with demodulation. For this, Stokes-to-intensity transformation and de-mapping are performed. These two processes can be performed with a single matrix multiplication applied to the received vector \( u(n) \), as presented in Eq. (6) [10]. Table 3 summarizes the number of operations required for demodulation, which comprises \( 4 \times 4 \times 4 \) multiplications and \( 3 \times 4 \times 4 \) additions for demultiplexing.

Table 3. Number of Real Additions and Real Multiplications for Stage 3: Demodulation

| Action | Operation | Num. of operations/s | Mathematical expr. |
|--------|-----------|----------------------|--------------------|
| Demultiplexing | mult | 64 \( \cdot \) Baud | Eq. (6) |
| | add | 48 \( \cdot \) Baud | |

4. Power consumption

Power consumption is currently an important criterion to evaluate the feasibility of 100 Gbps IM/DD links [16]. Consequently, this section first provides a power consumption analysis based on the number of operations for each DSP block. Second, in order to define the power consumption for each of the scenarios, the system layouts for quad-polarization, WDM, and parallel optics are described, providing an overview of all optical and electronic components needed for \( 4 \times 32 \) Gbps transmission. Finally, numerical results are reported.

4.1. Quad-polarization DSP energy consumption

The number of real additions and multiplications depends on the specific DSP implementation. The authors would like to emphasize that the DSP has not been optimized for power consumption; consequently, the results presented here serve as rough estimates rather than exact, optimized figures. The total power consumption also depends on the particular application-specific integrated circuit (ASIC) used. The numerical results presented here are based on a commercially available ASIC to provide a numerical approximation rather than an exact estimate, since the ASIC power consumption can be reduced by modifying the frequency of operation or the supply voltage of the CMOS technology [17]. An ASIC based on 90 nm CMOS process technology is considered. Nortel provides detailed information for one of their ASIC designs which is suitable for 32 Gbd and consumes 1.5 pJ and 0.5 pJ per real multiplication and real addition, respectively [18]. These values have been used for power consumption evaluation of the quad-polarization DSP. The energy consumption associated with the DSP is described in Eq. (11).

\[
E_{DSP} = E_{\text{front-end}} + E_{\text{track}} + E_{\text{dem}} \tag{11}
\]

where $E_{\text{front-end}}$, $E_{\text{track}}$, and $E_{\text{dem}}$ are the energy associated with the front-end corrections, SOP tracking, and demodulation, respectively. The energy for each DSP block has been calculated from the number of multiplications and additions described in Section 3 and the average energy consumption per real multiplication and addition of the aforementioned ASIC.

Table 4. Modelling Parameters - Power Consumption

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Definition</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td>Baud</td>
<td>Baudrate</td>
<td>$32 \cdot 10^9$</td>
</tr>
<tr>
<td>$N_{\text{sym}}$</td>
<td>Number of samples per symbol</td>
<td>16</td>
</tr>
<tr>
<td>$N_1$</td>
<td>Anti-aliasing filter taps</td>
<td>640</td>
</tr>
<tr>
<td>$N_2$</td>
<td>Matched filter taps</td>
<td>2</td>
</tr>
<tr>
<td>$D$</td>
<td>Decimation factor</td>
<td>$80 \cdot 10^9$</td>
</tr>
<tr>
<td>$\delta$</td>
<td>Number of reference updates</td>
<td>3/16</td>
</tr>
<tr>
<td>$L_{th}$</td>
<td>Number of threshold updates</td>
<td>1/100</td>
</tr>
<tr>
<td>$F_A$</td>
<td>ADC figure of merit</td>
<td>$2.5 \cdot 10^{-12}$ J/conv-step</td>
</tr>
<tr>
<td>$n_{\text{adc}}$</td>
<td>ADC resolution</td>
<td>4 bits</td>
</tr>
<tr>
<td>$DFB$</td>
<td>Distributed Feedback laser</td>
<td>2.5 W</td>
</tr>
<tr>
<td>$MZM$</td>
<td>Mach-Zehnder modulator</td>
<td>832 mW</td>
</tr>
<tr>
<td>$PD$</td>
<td>Photo-detector</td>
<td>25 mW</td>
</tr>
</tbody>
</table>

Table 4 summarizes the parameters used and their values for the DSP power consumption calculation, as well as the specifications of the opto-electronic components. The values are based on the experimental demonstration reported in [10]. Results on DSP power consumption are presented in Table 5. For WDM and parallel optics, it is assumed that only front-end corrections are required in the DSP. Implementations of both technologies that do not use DSP are also possible. The power consumption analysis for both scenarios is provided in Section 4.3.

Quad-polarization DSP power consumption is divided into two parts: 1) the initialization process for the SOP tracking algorithm (i.e., SOP tracking), and 2) the demodulation process. The power consumption associated with the polarization tracking algorithm depends on how fast the algorithm converges (i.e., the number of iterations needed to find an accurate estimate for $v(n)$). In [7], a 1000-sample period for the tracking algorithm to converge is presented, and the maximum convergence depth has been reported to be a 2000-sample period [10]. However, for the results presented in Table 5, the use of a training sequence is assumed, reducing the number of iterations to one [19]. For the quad-polarization DSP two additional power consumption contributions have to be considered: the reference update and the threshold update(s). These two processes occur for 3/16 of the symbols (when tracking three vectors) and 1/100 of the symbols, respectively.

For stage 3 (i.e., demodulation) of the quad-polarization scenario, power consumption is split in two: the demodulation process based on Eq. (6), and the contribution of the analog-to-digital converters (ADCs). The power consumption of the ADC is calculated based on Eq. (12) [14].

$$E_{\text{ADC}} = 4F_A n_{\text{adc}} F_s \tag{12}$$

where $F_A$ is a figure of merit, $n_{\text{adc}}$ is the nominal ADC resolution (ADC resolution refers to the physical ADC resolution, and it is considered to be two bits higher than the effective number of bits [14]), and $F_s$ is the sampling rate. As presented in Table 5, the majority of the DSP power consumption, 658.7 mW, is associated with the front-end corrections, which are common to the three scenarios. The additional DSP needed for SOP tracking and demodulation in the quad-polarization scenario is negligible (0.4 and 3.8 mW, respectively). However, as many ADCs as SOPs are required, and ADCs are among the most power-demanding elements. Each ADC consumes 3.2 W, giving 12.8 W in total for the quad-polarization case. For parallel optics and WDM, the ADCs account for one fourth of this power consumption.
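The ADC figures quoted above follow directly from Eq. (12) and the Table 4 parameters, as the short Python check below illustrates. The function name is hypothetical, and the count of four ADCs (one per Stokes-analyzer branch) is taken from the statement that as many ADCs as SOPs are required.

```python
def adc_power_watts(f_a, n_adc, f_s):
    """Evaluate Eq. (12): ADC power from figure of merit, resolution, sample rate."""
    return 4.0 * f_a * n_adc * f_s


F_A = 2.5e-12      # J per conversion step (Table 4)
N_ADC = 4          # nominal resolution in bits (Table 4)
F_S = 80e9         # sampling rate of the DSO in samples per second
NUM_ADCS = 4       # one ADC per SOP in the quad-polarization receiver

per_adc = adc_power_watts(F_A, N_ADC, F_S)
print(f"per ADC: {per_adc:.1f} W, all ADCs: {NUM_ADCS * per_adc:.1f} W")
```

Setting the resolution to one bit in the same expression gives one fourth of the per-ADC power, which is the reduction assumed for WDM and parallel optics.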

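Table 5. Power Consumption

<table>
<thead>
<tr>
<th>Stage</th>
<th>Scenario</th>
<th>Power consumption</th>
</tr>
</thead>
<tbody>
<tr>
<td>Front-end correction</td>
<td>WDM, parallel optics, quad-polarization</td>
<td>658.7 mW</td>
</tr>
<tr>
<td>SOP tracking</td>
<td>Quad-polarization</td>
<td>0.4 mW</td>
</tr>
<tr>
<td>Demodulation</td>
<td>Quad-polarization</td>
<td>3.8 mW</td>
</tr>
<tr>
<td>ADC</td>
<td>Quad-polarization</td>
<td>12.8 W (3.2 W per ADC)</td>
</tr>
<tr>
<td></td>
<td>WDM, parallel optics</td>
<td>One fourth of the quad-polarization value</td>
</tr>
</tbody>
</table>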

4.2. System layout

The power consumption analysis of the three scenarios includes the contribution of the required optical and electronic components for each scenario. Thus, in this section the testbeds used for the WDM and parallel optics systems are described and compared to the quad-polarization testbed described in Section 2.

Figure 5 presents the block diagram for (a) WDM and (b) parallel optics. The first part of the transmitter side, which was described in Section 2 is common for the three scenarios. For the WDM subsystem no power equalization stage is required before transmission. However, it is maintained in the setup in order to assure the same transmitted signal-to-noise ratio (SNR) as for the quad-polarization case. The four WDM signals are multiplexed with a standard 4:1 optical coupler.

After transmission and wavelength demultiplexing in an arrayed waveguide grating, the NRZ signals are detected by four PDs and sampled by the DSO described above. The stored signals are processed offline using DSP to perform front-end correction, demodulation, and BER evaluation. In the parallel optics scenario the signals are sent independently through four independent fibers. The experimental setup was implemented using a single SSMF and a switch to take the measurements from channel A to channel D as shown in Fig. 5(b) to assure the same transmission link performance for all the channels. Analogous to the WDM subsystem, the power equalization stage and the PCs at the receiver side are not required in the set-up, although maintained to assure the same transmitted SNR and photodiode insertion loss as for the quad-polarization system in order to offer a fair comparison between the three scenarios.

#### 4.3. Power consumption numerical results

The power consumption from the electrical signal generation, light sources (i.e., DFBs), modulators (i.e., MZMs), and photo-detectors (i.e., PDs) is considered a base contribution common to the three scenarios. The power equalization stage of the quad-polarization scenario is not accounted for in the overall power consumption, since the equalizer could be implemented as a passive component. The remaining optical and electronic components for both WDM and parallel optics are passive optics, and thus their contributions to the overall power consumption are negligible. However, it should be noted that for some WDM channel spacings and deployment conditions active temperature control is needed in order to maintain alignment between the transmitter and the wavelength demultiplexer. The DSP developed in Section 2 is used to analyze the receiver energy consumption, where only stage 1 and a simple hard decision stage are used for both WDM and parallel optics. Additional polarization tracking and demapping stages are accounted for in the quad-polarization scenario.

Fig. 6. Quad-polarization components power consumption. ADC, analog to digital converter; DFB, distributed feedback laser; DSP, digital signal processing; MZM, Mach-Zehnder modulator; PD, photo-detector.

For the base power consumption contribution, 2.5 W, 832 mW, and 25 mW are considered for each light source, modulator, and photo-detector, respectively [14]. Figure 6 presents the contributions of each of the components for the quad-polarization scenario. The additional DSP for polarization tracking accounts for only 2.47% of the total power consumption owing to its low complexity; however, the power-hungry ADCs account for 47.6% of the total power consumption.
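As a rough check on the base contribution, the per-component figures quoted above can be summed per lane and for the whole link. The sketch below assumes four lanes, each with one DFB, one MZM, and one PD; the lane count and all names are illustrative, and the electrical signal generation is left out because no figure for it is quoted here.

```python
# Per-component power figures quoted in the text (taken from [14]), in watts.
P_DFB_W = 2.5     # one light source (DFB)
P_MZM_W = 0.832   # one modulator (MZM)
P_PD_W = 0.025    # one photo-detector (PD)

# Assumption: four lanes, each using one DFB, one MZM, and one PD.
N_LANES = 4


def lane_base_power_w() -> float:
    """Base optical-component power of a single lane, in watts."""
    return P_DFB_W + P_MZM_W + P_PD_W


def link_base_power_w(n_lanes: int = N_LANES) -> float:
    """Base optical-component power of the whole link, in watts."""
    return n_lanes * lane_base_power_w()


if __name__ == "__main__":
    print(f"per lane: {lane_base_power_w():.3f} W")
    print(f"{N_LANES} lanes: {link_base_power_w():.3f} W")
```

Under these assumptions the base contribution is about 3.36 W per lane and about 13.4 W for four lanes, with the light sources as the dominant term.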

The power consumption and complexity values, normalized to the quad-polarization system, are presented in Fig. 7(a) and Fig. 7(b), respectively. Results show a power consumption penalty of 37% for quad-polarization compared to the parallel optics and WDM systems, due to the more demanding ADC contribution. The penalty rises to as much as 49% when parallel optics and WDM are operated without ADCs. However, the increased complexity associated with the DSP remains below 1%.

5. Performance

This section compares the transmission performance, in terms of bit error rate (BER), of quad-polarization, the 4-channel WDM system, and parallel optics. Each configuration was evaluated at 4x32 Gbps in back-to-back (B2B) operation and after transmission over 2 km of G.652 standard single mode fiber (SSMF) (16.5 ps/nm·km chromatic dispersion, 0.2 dB/km attenuation). The BER results presented account for the entire system, and the received power in all cases corresponds to the total received power, not the received power per photodetector. In all systems some variation in sensitivity between channels is observed.
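The link parameters above translate into a few simple quantities. The following sketch computes the accumulated chromatic dispersion and the fiber attenuation over the 2 km span, and converts a total received power into a per-photodetector power. The even split across four photodetectors is an assumption made only for illustration, and the function names are hypothetical.

```python
import math

# Fiber parameters quoted in the text for G.652 SSMF.
DISPERSION_PS_PER_NM_KM = 16.5
ATTENUATION_DB_PER_KM = 0.2
LENGTH_KM = 2.0


def accumulated_dispersion_ps_per_nm(length_km: float = LENGTH_KM) -> float:
    """Chromatic dispersion accumulated over the span, in ps/nm."""
    return DISPERSION_PS_PER_NM_KM * length_km


def fiber_loss_db(length_km: float = LENGTH_KM) -> float:
    """Fiber attenuation over the span, in dB."""
    return ATTENUATION_DB_PER_KM * length_km


def per_detector_power_dbm(total_dbm: float, n_detectors: int = 4) -> float:
    """Power per photodetector, ASSUMING the total splits evenly."""
    return total_dbm - 10.0 * math.log10(n_detectors)


if __name__ == "__main__":
    print(f"accumulated dispersion: {accumulated_dispersion_ps_per_nm():.1f} ps/nm")
    print(f"fiber loss: {fiber_loss_db():.1f} dB")
    print(f"0 dBm total -> {per_detector_power_dbm(0.0):.2f} dBm per detector")
```

With an even four-way split, the per-photodetector power sits about 6 dB below the total received power used on the BER axes.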

#### 5.1. Results

For the quad-polarization case, the BER performance of the system for perfect polarization rotation compensation post-convergence is plotted in Fig. 8 with black square symbols for the B2B case and red circle symbols for the 2 km transmission case. FEC limits for both 7% and 20% overhead are shown for reference. A BER below the FEC limits is obtained both for B2B and for 2 km transmission. The receiver sensitivity for 100 Gbps net bitrate assuming 7% FEC overhead is 4.4 dBm. A penalty of 0.5 dB is observed after transmission. The B2B receiver sensitivity for 20% FEC overhead is 3.7 dBm, with a measured penalty of 0.5 dB after 2 km SSMF transmission. For the WDM system, the four 32 Gbd NRZ data signals, one per WDM channel, were successfully recovered after 2 km transmission. In Fig. 9 the BER for B2B (red circles) and after transmission (black squares) is plotted as a function of the input power into the receiver for the entire WDM system. The receiver sensitivity for 100 Gbps net bitrate is -9.5 dBm and -10.2 dBm assuming 7% and 20% FEC overhead, respectively, with no transmission penalty measured. Finally, for the parallel optics scenario, the results presented in Fig. 9 show a receiver sensitivity for 100 Gbps net bitrate of -10.5 dBm and -11.3 dBm for 7% and 20% FEC overhead, respectively. The 1 dB improvement with respect to the WDM case is primarily due to the insertion loss of the AWG.
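To make the reported figures easier to compare, the sketch below checks that the 4x32 Gbps gross rate leaves at least 100 Gbps of payload for both FEC options, and expresses the sensitivity gaps between the schemes in dB and as linear power ratios. It assumes the overhead is defined relative to the payload, so that net rate = gross rate / (1 + overhead); that convention and the helper names are assumptions rather than something stated in the text.

```python
GROSS_GBPS = 4 * 32.0  # four lanes at 32 Gbps each

# Receiver sensitivities reported above for 7% FEC overhead, in dBm.
SENSITIVITY_7PCT_DBM = {
    "quad-polarization": 4.4,
    "WDM": -9.5,
    "parallel optics": -10.5,
}


def net_bitrate_gbps(gross_gbps: float, overhead: float) -> float:
    """Payload rate, ASSUMING overhead is a fraction of the payload."""
    return gross_gbps / (1.0 + overhead)


def db_to_linear(db: float) -> float:
    """Convert a power ratio in dB to a linear ratio."""
    return 10.0 ** (db / 10.0)


def sensitivity_gap_db(scheme: str, reference: str = "parallel optics") -> float:
    """Extra received power a scheme needs relative to the reference, in dB."""
    return SENSITIVITY_7PCT_DBM[scheme] - SENSITIVITY_7PCT_DBM[reference]


if __name__ == "__main__":
    for overhead in (0.07, 0.20):
        rate = net_bitrate_gbps(GROSS_GBPS, overhead)
        print(f"{overhead:.0%} FEC: {rate:.1f} Gbps net")
    for scheme in ("WDM", "quad-polarization"):
        gap = sensitivity_gap_db(scheme)
        print(f"{scheme}: {gap:.1f} dB (x{db_to_linear(gap):.1f}) vs parallel optics")
```

Under this convention both FEC options leave more than 100 Gbps of payload, and the quad-polarization sensitivity lies 14.9 dB, roughly a factor of 31 in optical power, above that of parallel optics.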

![BER sensitivity to PD input power for 32 Gbd quad-polarization for B2B (black, square) and 2 km SSMF transmission (red, circle).](image1)

![BER sensitivity to PD input power for 32 Gbd for WDM for B2B (red, circle) and 2 km SSMF transmission (black, square), and for parallel optics for B2B (green, triangle) and 2 km SSMF transmission (blue, triangle).](image2)

6. Conclusion

This work presents a comparison between quad-polarization, 4-channel WDM, and 4-line parallel optics in terms of computational complexity, power consumption, and system receiver sensitivity. The study includes a detailed analysis of the additional DSP required at the receiver for the quad-polarization scenario, and of its power consumption contribution based on the number of operations needed for each of the DSP blocks. Quad-polarization for direct-detection optical subsystems allows the transmission of four parallel data streams using four different states of polarization. This approach has the potential to increase the capacity per channel. However, the increase in capacity comes at the expense of both power consumption and receiver sensitivity. Results show that the quad-polarization DSP complexity is very low, adding only 2.47% to the power consumption compared to WDM and parallel optics. However, ADCs are required, and they account for 47.6% of the total system power consumption.

In terms of BER, parallel optics provides the best receiver sensitivity (i.e., the lowest required received power): -10.5 dBm for 7% FEC overhead with no transmission penalty. For the WDM system a 1 dB penalty is observed, although the footprint (i.e., cabling) is reduced by a factor of four. For the quad-polarization system a receiver sensitivity of 4.4 dBm is obtained, with a 0.5 dB penalty after transmission. This sensitivity penalty is expected, since it reflects the comparison between a single-level approach and a multi-level alternative. Better sensitivity could be achieved with dual-polarization solutions.

Quad-polarization combined with WDM and/or parallel optics offers an alternative solution for 400 Gbps low-complexity short-reach optical transmission systems, since it enables bitrate quadrupling for each laser-photodiode-ADC lane, but it comes with the performance costs typically associated with multilevel modulation formats. Additionally, quad-polarization can be considered a competitor to the solutions emerging from recent standardization activities, such as those of the IEEE 802.3 400 Gigabit Task Force, which adopted PAM 4 for 100 Gbps per wavelength. The relative performance, complexity, and power dissipation of the quad-polarization scheme, using 100 Gbps PAM 4 as a reference for comparison, would be an interesting topic for future research.