Enhanced MIMO Detection with Parallel V-BLAST
Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2V4, Canada
{pankeuyo|cockburn}@ualberta.ca
Abstract: Foschini quantified the large capacity of the multiple-input multiple-output (MIMO) wireless channel and showed how this capacity could be achieved using a layered and coded space-time architecture. Unfortunately, his proposed diagonal Bell Laboratories Space-Time (D-BLAST) detection algorithm has proved awkward to implement. Attention has instead focused on the simpler vertically-layered architecture. The well-known vertical MIMO detectors, such as zero forcing (ZF), minimum mean squared error (MMSE), maximum likelihood (ML), vertical BLAST (V-BLAST), and several versions of sphere decoding (SD), offer different trade-offs between computational complexity and performance. V-BLAST offers intermediate but clearly suboptimal performance, with a computational complexity that grows linearly in the number of transmitted layers and in the size M of the symbol constellation. Fouladi Fard, Alimohammad and Cockburn recently proposed a parallel V-BLAST algorithm, which we call F-BLAST, that offers performance approaching that of optimal ML at the cost of performing V-BLAST in parallel for all M possible values of the symbol in the layer with the weakest expected signal-to-noise ratio. Here we revisit the performance of F-BLAST and show how the degree of parallelism can be reduced while maintaining performance that greatly exceeds that of V-BLAST. The data-parallel structure of the new detection algorithms, and their simpler control structure compared to SD, should offer implementation advantages.

Keywords: MIMO, V-BLAST, sphere decoding, near-optimal detection, parallel detection.

I. INTRODUCTION

An $n_T \times n_R$ single-user Multiple-Input Multiple-Output (MIMO) communication system has $n_T > 1$ transmitting antennas and $n_R > 1$ receiving antennas [1].
When an $n_T$-element symbol vector $s$ is transmitted over a flat-fading radio channel, the $n_R$-element complex received signal vector $y$ can be expressed as $y = Hs + n$, where $H$ is the $n_R \times n_T$ channel matrix and $n$ is additive white Gaussian noise (AWGN). It has been established that the capacity of the $n_T \times n_R$
MIMO channel, provided that $H$ is sufficiently white and is accurately known at the receiver, approaches $\min(n_T, n_R)$ times the Shannon capacity of a conventional $1 \times 1$ channel [2]. The obvious benefits of this capacity gain have caused MIMO technology to be incorporated in most of the latest wireless standards, including IEEE 802.11n, WiMAX and LTE [3].

II. REVIEW OF MIMO DETECTORS

In Foschini's original MIMO scheme, the transmitted data is demultiplexed into $n_T$ substreams of equal data rate [4]. Each substream is then encoded separately using block encoders. The resulting $n_T$ substreams, called layers, are mapped to the $n_T$ transmitting antennas by a diagonal encoder that periodically rotates the mapping from the layers to the transmitting antennas. In the receiver, the D-BLAST algorithm uses successive interference cancellation to iteratively detect the symbols in each diagonal layer. Specifically, D-BLAST uses symbol nulling and symbol cancellation to minimize the interference in the present signal vector $y$ caused by the yet-to-be-detected and already-detected symbols, respectively, belonging to different layers but transmitted in the same symbol time. Unfortunately, the rotating mapping used in Foschini's scheme and the diagonal arrangement of the layers in space-time lead to a relatively complicated detection algorithm. Most subsequent work has thus considered simpler vertically-layered MIMO schemes, where each layer lies within the same symbol time. This allows each symbol vector to be recovered from the corresponding single signal vector.

A variety of vertical MIMO detection algorithms have been proposed in the literature that offer trade-offs between the detection accuracy and the computational complexity in the receiver.
Statistically-optimal detection using the maximum likelihood (ML) algorithm [1] is impractical for most systems since the computational complexity increases exponentially with the number $n_T$ of transmit antennas and the number of bits encoded in the transmitted symbol vector [2]. Zero forcing (ZF) and minimum mean squared error (MMSE) detectors offer fast detection that is linear in both $n_R$ and $n_T$. They both involve premultiplying $y$ by a conditioning matrix $G$, computed from the receiver's estimate $H'$ of $H$, and then slicing each component to the nearest constellation point. The conditioning matrix $G$ is chosen to minimize the inter-layer interference. For ZF detectors, the conditioning matrix is the inverse of $H'$ when $n_R = n_T$, or its pseudo-inverse when $n_R > n_T$ [1]. MMSE detectors achieve better signal detection accuracy than ZF detectors by using a conditioning matrix that exploits both $H'$ and an estimate of the signal-to-noise ratio of the channel.

(Footnote: This work was supported by iCORE and by the Natural Sciences and Engineering Research Council (NSERC) of Canada under grant OGP0105567. 978-1-4577-0253-2/11/$26.00 ©2011 IEEE)

SD detectors search an $n_T$-dimensional hypersphere subset of the universe of all possible signal vectors $s$. Estimates of the signal power are used to dynamically adjust the size of the hypersphere centered on an initial estimate of $s$. SD provides near-optimal detection at the cost of a relatively complicated control algorithm that ensures that the hypersphere is sized correctly and searched efficiently [5, 6]. The algorithmic complexity of SD and the variability in the computational load have motivated the development of more easily implemented algorithms derived from SD, such as the tree-search-based K-best algorithm [7, 8]. Other researchers have investigated ways of reducing the complexity of SD by restricting the search space along some dimensions [9].
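As an illustration of the two linear detectors reviewed above, the following is a minimal NumPy sketch of ZF and MMSE detection (premultiply $y$ by $G$, then slice). The function names, the unit-energy square-QAM helper, and the toy noiseless channel are assumptions made for this example; they do not come from the paper or from the authors' MATLAB modules.

```python
import numpy as np

def qam_constellation(M):
    """Square M-QAM points scaled to unit average energy (M must be a perfect square)."""
    m = int(round(np.sqrt(M)))
    levels = np.arange(-(m - 1), m, 2)              # e.g. [-3, -1, 1, 3] for M = 16
    points = (levels[:, None] + 1j * levels[None, :]).ravel()
    return points / np.sqrt(np.mean(np.abs(points) ** 2))

def slice_symbols(z, constellation):
    """Slice each component of z to the nearest constellation point."""
    nearest = np.argmin(np.abs(z[:, None] - constellation[None, :]), axis=1)
    return constellation[nearest]

def zf_detect(H_est, y, constellation):
    """Zero forcing: G is the inverse (square H_est) or pseudo-inverse (n_R > n_T)."""
    G = np.linalg.pinv(H_est)
    return slice_symbols(G @ y, constellation)

def mmse_detect(H_est, y, constellation, noise_var, symbol_var=1.0):
    """MMSE: G additionally uses the estimated noise-to-signal power ratio."""
    n_t = H_est.shape[1]
    Hh = H_est.conj().T
    G = np.linalg.solve(Hh @ H_est + (noise_var / symbol_var) * np.eye(n_t), Hh)
    return slice_symbols(G @ y, constellation)

# Toy demonstration (hypothetical values): noiseless, well-conditioned 4x4 channel.
rng = np.random.default_rng(0)
constellation = qam_constellation(16)
H = np.eye(4) + 0.05 * (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
s = rng.choice(constellation, size=4)
y = H @ s
s_zf = zf_detect(H, y, constellation)
s_mmse = mmse_detect(H, y, constellation, noise_var=1e-6)
```

On this noiseless toy channel both detectors return the transmitted vector; the interesting differences between them only appear once noise and poorly conditioned channels are simulated.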
These SD-inspired detectors, however, remain challenging to implement in silicon because of their algorithmic complexity.

The vertical BLAST (V-BLAST) detection algorithm offers greater accuracy than either ZF or MMSE with a computational complexity that grows linearly in the number $n_T$ of transmitting antennas [10, 11]. Like D-BLAST, V-BLAST uses an iterative successive interference cancellation strategy. The first symbol to be detected in a signal vector $y$ must be recovered with no knowledge of any other symbol in the corresponding symbol vector $s$. Moreover, any detection errors in the first symbol will enhance the interference experienced during the detection of the remaining symbols in $s$, and this could cause error propagation. To minimize the symbol error rate (SER) of the first and subsequent layers, V-BLAST maintains estimates of the post-detection signal-to-noise ratio (SNR) for each layer, and then orders the layers from the greatest to the least estimated SNR. The first symbol, which has the largest expected SNR, is then usually detected following premultiplication by either a ZF or (most commonly) an MMSE conditioning vector, which attempts to null the interference from the $n_T - 1$ other layers.

As in D-BLAST, in V-BLAST the receiver is assumed to have an accurate estimate $H'$ of the channel matrix $H$. Let $s$ denote the column vector comprising $n_T$ symbols that arrive in the same symbol time. These $n_T$ symbols are combined linearly by $H$ and corrupted with AWGN $n$ to form the $n_R$
complex elements of the received vector $y = Hs + n$. Given an estimate $H'$ of the channel matrix, one can compute a nulling matrix $G$ as follows [10]:

$$G = \left( H'^{H} H' + \frac{\sigma_n^2}{\sigma_s^2} I_{n_T} \right)^{-1} H'^{H} \qquad (1)$$

where $H'^{H}$ denotes the Hermitian of $H'$, $\sigma_s^2 / \sigma_n^2$ denotes the estimated SNR at the receiver, and $I_{n_T}$ denotes an $n_T \times n_T$
identity matrix. This definition of $G$ specifies a linear prefilter that, when applied to the received signal vectors $y$, allows the symbol vectors $s$ to be estimated with minimum mean squared error (MMSE) under the assumption of identically-distributed transmitted symbols; that is, the expectation $E\{\|Gy - s\|^2\}$ is minimized. The norms of the rows of $G$ are inversely proportional to the expected post-detection SNR of the symbols in $s$, and thus $G$ can be used to re-order the detection of the signal layers from $(1, 2, \ldots, n_T)$ to some permutation $(k_1, k_2, \ldots, k_{n_T})$ according to their SNR.

(Step 0) Ordering: Define the row ordering $(k_1, k_2, \ldots, k_{n_T})$ recursively as follows. Define $k_1 \in \{1, \ldots, n_T\}$ to be the row index for $G = G_1$ (the nulling matrix for $H'$) that identifies the row with the smallest norm. For $j \geq 1$, define $H_{j+1}$ to be the matrix obtained by deleting columns $k_1, k_2, \ldots, k_j$ from $H'$ and define $G_{j+1}$ to be the nulling matrix for $H_{j+1}$. Finally, define $k_{j+1}$
to be the original row index for $G$ that corresponds to the row in $G_{j+1}$ with the smallest norm.

After the row/layer ordering has been determined, let $s_1$ denote the symbol from layer $k_1$ that is detected first from $y$, let $s_2$ denote the symbol from layer $k_2$
that is detected second, and so on, with $s_{n_T}$ denoting the symbol from layer $k_{n_T}$ that is detected last. Without loss of generality, the symbols in $s$ and the columns of $H$ can be permuted so that $k_1 = 1, k_2 = 2, \ldots, k_{n_T} = n_T$. The detection of symbol $s_j$, where $1 \leq j \leq n_T$, occurs in three steps given an $n_R$-element vector $y^{(j)}$, where $y^{(1)}$ is the initial signal vector $y$ and $y^{(j)}$, for $j \geq 2$, is a processed signal vector that results from canceling the predicted interference from the $j - 1$ previously detected symbols of $s$. The first symbol $s_1$ to be detected is not preceded by a symbol cancellation step. The three steps in V-BLAST are as follows, for $j$ iterating from 1 to $n_T$:

(Step 1) Nulling: Vector $y^{(j)}$ contains interference from symbols $s_{j+1}, \ldots, s_{n_T}$. However, this interference can be minimized by premultiplying $y^{(j)}$ by the nulling vector $g_j$, which is the $j$th row of $G_j$.

(Step 2) Slicing: Symbol $s_j$ is detected by selecting the symbol $\hat{s}_j$ that minimizes the difference $\|g_j y^{(j)} - \hat{s}_j\|$ over all M possible symbols $\hat{s}_j$ in the constellation.

(Step 3) Cancellation: Vector $y^{(j+1)}$ is computed by subtracting $H'[\hat{s}_1, \hat{s}_2, \ldots, \hat{s}_j, 0, \ldots, 0]^T$ from $y$. Note that this step is not required after detecting the last symbol $s_{n_T}$.

The performance of detection algorithms at high SNRs is usefully characterized by their diversity order, which corresponds to the negative of the slope of the logarithm of the bit error rate (BER) plotted against the logarithm of the SNR per bit [1]. By fully exploiting all available information at the receiver, the ML detector achieves a diversity order of $n_R$, the same as a maximum ratio combiner of the $n_R$ received signals. Linear detectors, such as ZF and MMSE, achieve a diversity order of $n_R - n_T + 1$ [1]. V-BLAST is limited to this same diversity order because a linear detector, usually MMSE, is used to recover the first symbol in each layer [12].
This observation motivated our search for ways of further reducing the interference affecting the first layer while maintaining the algorithmic simplicity of V-BLAST.

Figure 1. Near-optimal performance of F-BLAST [13].
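Steps 0 to 3 of V-BLAST, with the MMSE nulling matrix of Eq. (1), can be sketched in NumPy as follows. This is an illustrative reading of the description in this section, not the authors' code: the function names, the unit symbol variance default, and the small deterministic test channel are assumptions. Cancellation is done incrementally (one detected layer at a time), which is equivalent to subtracting $H'[\hat{s}_1, \ldots, \hat{s}_j, 0, \ldots, 0]^T$ from $y$.

```python
import numpy as np

def mmse_nulling_matrix(H, noise_var, symbol_var=1.0):
    """Eq. (1): G = (H^H H + (sigma_n^2 / sigma_s^2) I)^(-1) H^H."""
    Hh = H.conj().T
    return np.linalg.solve(Hh @ H + (noise_var / symbol_var) * np.eye(H.shape[1]), Hh)

def vblast_detect(H_est, y, constellation, noise_var):
    """Return (detected symbol vector, detection order k_1, ..., k_nT), 0-based."""
    n_t = H_est.shape[1]
    remaining = list(range(n_t))            # layers that have not been detected yet
    s_hat = np.zeros(n_t, dtype=complex)
    y_j = np.array(y, dtype=complex)        # y^(1) = y (copied, caller's y is untouched)
    order = []
    while remaining:
        # Step 0: nulling matrix for the channel with detected columns deleted;
        # its smallest-norm row marks the layer with the largest expected SNR.
        G_j = mmse_nulling_matrix(H_est[:, remaining], noise_var)
        row = int(np.argmin(np.linalg.norm(G_j, axis=1)))
        k = remaining.pop(row)
        z = G_j[row] @ y_j                                              # Step 1: nulling
        s_hat[k] = constellation[np.argmin(np.abs(z - constellation))]  # Step 2: slicing
        y_j = y_j - H_est[:, k] * s_hat[k]                              # Step 3: cancellation
        order.append(k)
    return s_hat, order

# Toy demonstration (hypothetical values): QPSK over a noiseless 4x4 channel
# whose column gains 1, 3, 2, 4 make the detection order easy to predict.
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
H = (np.diag([1.0, 3.0, 2.0, 4.0]) + 0.1 * (np.ones((4, 4)) - np.eye(4))).astype(complex)
s = qpsk[[0, 3, 1, 2]]
y = H @ s
s_hat, order = vblast_detect(H, y, qpsk, noise_var=1e-3)
```

In this example the layers are detected from the strongest column (gain 4) to the weakest (gain 1), so `order` lists the layer indices in decreasing expected SNR and its last entry identifies the weakest layer.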
Figure 2. Symbol Error Rate vs. SNR for F-BLAST for different parallel-search layers and increasing numbers of antennas.
III. F-BLAST: AN IMPROVED PARALLEL V-BLAST

The performance of V-BLAST can indeed be improved by increasing the detection accuracy of the first symbol. Fouladi Fard et al. proposed that this be done by considering all M possible values of the symbol in the weakest layer [13], and then for each value canceling its predicted interference on the signal vector $y$ and performing conventional V-BLAST to detect the $n_T - 1$ remaining layers. The detected signal vector is the one $\hat{s}$ among the M candidates that minimizes the squared error $\|H\hat{s} - y\|^2$. We will refer to this parallelized and improved V-BLAST as F-BLAST. The computational complexity of F-BLAST is greater than that of conventional V-BLAST by a factor of roughly $M(n_T - 1)/n_T$.

The three outer plots in Fig. 1 are Symbol Error Rate (SER) vs. SNR curves that were obtained when simulating a 4×4 MIMO system that transmits 16-QAM symbols from each antenna. The elements of $H$ are complex and Gaussian-distributed with zero mean and unity variance. The received signals are corrupted with AWGN according to the given SNR. The upper plot was obtained with an MMSE detector. The middle plot used a conventional V-BLAST detector. Lastly, the lowest plot was obtained using an F-BLAST detector. The transmitted symbols were formed from $2 \times 10^4$ blocks. Each symbol block contains 10 frames, and each frame contains 2M symbol vectors, where M denotes the symbol constellation size (e.g., 16, 64, 256). A new matrix $H$ was generated for each frame. To ensure reliable statistics even at high values of SNR, the simulations were allowed to run longer so that each SER point represented more than 1000 symbol errors. These plots confirm the significantly improved performance of F-BLAST compared to both MMSE and V-BLAST, as reported in [13]. The inset in Fig. 1 compares the performance of an ML detector with F-BLAST. Note that the SER of F-BLAST approaches the optimal performance of ML for SNRs ranging from 0 dB to at least 24 dB.
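The F-BLAST procedure described at the start of this section can be sketched as below. This is a sequential NumPy illustration under stated assumptions, not the implementation of [13]: the loop over candidates stands in for the M parallel subdetectors, the weakest layer is taken to be the last layer in the conventional V-BLAST detection order, unit symbol variance is assumed in the MMSE nulling matrix, and all names and the toy channel are hypothetical.

```python
import numpy as np

def vblast_detect(H_est, y, constellation, noise_var):
    """Conventional MMSE V-BLAST; returns (symbol estimates, detection order)."""
    n_t = H_est.shape[1]
    remaining, order = list(range(n_t)), []
    s_hat = np.zeros(n_t, dtype=complex)
    y_j = np.array(y, dtype=complex)
    while remaining:
        H_j = H_est[:, remaining]
        Hh = H_j.conj().T
        # MMSE nulling matrix for the undetected layers (unit symbol variance assumed)
        G_j = np.linalg.solve(Hh @ H_j + noise_var * np.eye(len(remaining)), Hh)
        row = int(np.argmin(np.linalg.norm(G_j, axis=1)))   # strongest remaining layer
        k = remaining.pop(row)
        z = G_j[row] @ y_j
        s_hat[k] = constellation[np.argmin(np.abs(z - constellation))]
        y_j = y_j - H_est[:, k] * s_hat[k]
        order.append(k)
    return s_hat, order

def fblast_detect(H_est, y, constellation, noise_var):
    """Try every value of the weakest layer, run V-BLAST on the rest, keep the best fit."""
    n_t = H_est.shape[1]
    _, order = vblast_detect(H_est, y, constellation, noise_var)
    weakest = order[-1]                       # layer detected last has the lowest SNR
    others = [k for k in range(n_t) if k != weakest]
    best, best_err = None, np.inf
    for c in constellation:                   # M independent subdetectors
        y_c = y - H_est[:, weakest] * c       # cancel the hypothesised symbol
        s_sub, _ = vblast_detect(H_est[:, others], y_c, constellation, noise_var)
        candidate = np.zeros(n_t, dtype=complex)
        candidate[weakest] = c
        candidate[others] = s_sub
        err = np.linalg.norm(H_est @ candidate - y) ** 2   # ||H s_hat - y||^2
        if err < best_err:
            best, best_err = candidate, err
    return best, best_err

# Toy demonstration (hypothetical values): QPSK over a noiseless 4x4 channel.
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
H = (np.diag([1.0, 3.0, 2.0, 4.0]) + 0.1 * (np.ones((4, 4)) - np.eye(4))).astype(complex)
s = qpsk[[2, 0, 3, 1]]
y = H @ s
s_fb, err_fb = fblast_detect(H, y, qpsk, noise_var=1e-3)
```

Each candidate runs V-BLAST over $n_T - 1$ layers, which is where the cost factor of roughly $M(n_T - 1)/n_T$ comes from: $4 \cdot 3/4 = 3$ for this QPSK example, and $16 \cdot 3/4 = 12$ for 16-QAM in a 4×4 system.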
The near-optimal performance of F-BLAST has been confirmed in simulation for MIMO configurations 4×4, 6×6 and 8×8, for constellation sizes M ranging from 16 to 256, and for SNRs from 0 dB up to 40 dB.

Figs. 2(a), (b) and (c) show plots that illustrate the simulated performance of F-BLAST for 4×4, 6×6 and 8×8 MIMO systems, respectively, for all possible choices of the exhaustively searched layer. In the plots, W1 denotes the weakest layer, W2 denotes the second weakest layer, etc.; similarly, S1 denotes the strongest layer, S2 denotes the second strongest layer, etc. MI denotes the layer, other than layer S1, that is expected to cause the maximum interference on S1. Let $h_{S1}$ denote the column of $H$ corresponding to S1. The MI layer is the layer, other than S1, that corresponds to the element of $h_{S1}$ that has the largest magnitude. F(W2)-BLAST and F(MI)-BLAST refer to F-BLAST where the exhaustively searched layer is W2 and MI, respectively. Note that the best performance was obtained in all cases when the exhaustively searched layer was the weakest layer W1. Henceforth when we refer to F-BLAST we will always mean F(W1)-BLAST.

The diversity order of the M parallel V-BLAST subdetectors in F-BLAST will be increased by 1 with respect to a conventional V-BLAST detector for the original problem, since the number of interfering layers experienced by the first layer in the V-BLAST subdetector that is most likely to be successful will be $n_T - 2$ and not $n_T - 1$. Instead of using the weakest layer as the exhaustively searched layer in F-BLAST, one could in fact use any of the other layers (including the strongest layer). However, our simulation results confirm that the weakest layer is the best choice for the exhaustively searched layer since, as the channel SNR increases, the interference experienced by the strongest layer (as well as the $n_T - 2$ intervening layers) will become increasingly dominated by the interference from the weakest layer rather than from the channel AWGN.
Exhaustively searching over the weakest layer effectively eliminates its interference contribution on the $n_T - 1$ other layers. The near-optimal performance of F-BLAST shows that the overall diversity gain is thereby increased by an additional factor that approaches 2, for a total diversity gain approaching $n_T$ (i.e., the same as ML and SD).
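The layer labels used in this section (S1, W1, W2 and MI) can be derived from the channel estimate alone. The sketch below is one possible reading of those definitions and should be treated as an assumption throughout: it takes S1 and the weakest layers from the V-BLAST detection order with an MMSE nulling matrix (unit symbol variance), and it applies the MI rule literally by reading the index of the largest-magnitude element of $h_{S1}$ as a layer index, which presumes a square channel ($n_R = n_T$). The function name and the toy channel are hypothetical.

```python
import numpy as np

def layer_roles(H_est, noise_var):
    """Label layers S1, W1, W2 and MI (0-based indices) from the channel estimate."""
    n_r, n_t = H_est.shape
    assert n_r == n_t, "the literal MI rule used here presumes a square channel"
    remaining, order = list(range(n_t)), []
    while remaining:
        H_j = H_est[:, remaining]
        Hh = H_j.conj().T
        G_j = np.linalg.solve(Hh @ H_j + noise_var * np.eye(len(remaining)), Hh)
        # smallest-norm row of the nulling matrix = strongest remaining layer
        order.append(remaining.pop(int(np.argmin(np.linalg.norm(G_j, axis=1)))))
    s1 = order[0]
    mags = np.abs(H_est[:, s1]).astype(float)   # magnitudes of the elements of h_S1
    mags[s1] = -np.inf                          # exclude S1 itself
    return {"S1": s1, "W1": order[-1], "W2": order[-2],
            "MI": int(np.argmax(mags)), "order": order}

# Toy channel (hypothetical values) in which W2 and MI are different layers.
H = np.array([[4.0, 0.0, 0.0, 0.0],
              [0.5, 3.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
roles = layer_roles(H, noise_var=1e-3)
```

For this channel the strongest layer is layer 0, the weakest is layer 3, the second weakest is layer 2, and the MI rule picks layer 1.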
Figure 3. Search Windows of Size W = 8 (dark grey regions) and 16 (dark grey and surrounding shade) for the QAM Constellation of Size M = 64
Figure 4. Symbol Error Rate vs. SNR for MMSE, V-BLAST, FR-BLAST of Various Reduced Search Windows, and F-BLAST
IV. FR-BLAST: F-BLAST WITH REDUCED PARALLELISM

F-BLAST has a convenient parallel structure that could be exploited in implementations (e.g., simplified pipelining and sharing of hardware blocks). Note also that the parallel V-BLAST subdetectors share the same nulling matrix $G$. However, the M-fold parallelism of the subdetectors would likely be considered expensive in hardware cost and power for M = 16, let alone for 64 or larger. We therefore began investigating the possibility of trading off some of the excellent performance of F-BLAST in an effort to reduce the degree of parallelism in the searched layer, and hence reduce the cost. We call the resulting family of reduced parallelism detectors FR-BLAST (described with early results in [14]).

Two questions immediately arise in the design of FR-BLAST. First, should the searched layer continue to be the weakest layer? Second, how should the restricted window of symbols be constructed as a subset of the symbol constellation for the searched layer? Simulation results (obtained after [14]) convinced us that the weakest W1 layer was not the best choice for the parallel searched layer for FR-BLAST, as it is for F-BLAST. Rather, the second weakest W2 and maximum interference MI layers give the best (and very similar) results. We also determined that the W2 and MI layers were different over 65% of the time.

The search window position in the constellation can be determined in various ways, but we found that the MMSE estimate for the symbol in the searched layer was a reasonable choice. Given a fixed search window size (i.e., the number of considered symbols), the shape of the search window for each symbol position $s_X$ was determined empirically by simulation experiments that collected histograms for the (assumed near-optimal) F-BLAST estimate given that the MMSE estimate was $s_X$. For each histogram, a search window of size W was constructed by selecting the W most likely F-BLAST decisions for each $s_X$.
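The window construction just described amounts to keeping, for each possible MMSE estimate, the W most frequent F-BLAST decisions. A minimal NumPy sketch follows. The function names are hypothetical and the histogram values are made-up toy numbers for a four-point constellation, used only to show the mechanics; they are not simulation results from this paper.

```python
import numpy as np

def build_search_windows(counts, W):
    """counts[x, f]: how often F-BLAST decided symbol f when the MMSE estimate was symbol x.
    Returns an (M, W) look-up table; row x lists the W most frequent F-BLAST decisions."""
    counts = np.asarray(counts)
    return np.argsort(-counts, axis=1, kind="stable")[:, :W]

def window_coverage(counts, windows):
    """Fraction of the recorded F-BLAST decisions that fall inside each window."""
    counts = np.asarray(counts, dtype=float)
    covered = np.take_along_axis(counts, windows, axis=1).sum(axis=1)
    return covered / counts.sum(axis=1)

# Made-up toy histogram for M = 4 (each row sums to 100 observations).
toy_counts = np.array([[90,  6,  3,  1],
                       [ 5, 80,  2, 13],
                       [ 4,  1, 70, 25],
                       [ 0, 10, 30, 60]])
windows = build_search_windows(toy_counts, W=2)
coverage = window_coverage(toy_counts, windows)
```

The coverage value is a convenient way to see how well a window of a given size captures the likely decisions: a flatter histogram row gives a lower coverage for the same W.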
The M windows were then stored in look-up tables. (The number of tables can be reduced by exploiting constellation symmetry.) As one might expect, the resulting optimized windows included points that were relatively close in Euclidean distance to each $s_X$, but the precise window shapes did not have an appreciable effect on FR-BLAST performance. Figure 3 shows the ten unique search windows for W = 8 and 16 for M = 64. The number of windows has been reduced to ten in this figure by exploiting all possible symmetries.

Figs. 4(a), (b) and (c) were obtained with MMSE, various versions of FR-BLAST, and F-BLAST when detecting 64, 128 and 256-QAM symbols in a 4×4 MIMO system. Note the large and stable diversity order of F-BLAST at high SNRs, which carries on the near-optimal performance that was directly verified in simulation in comparison with ML for the lower SNR values. The different versions of FR-BLAST were obtained by considering the three best choices (W1, W2 and MI) of the searched layer with various search window sizes. For each considered combination of constellation size M and window size, using either W2 or MI as the searched layer produced better performance than W1. For each combination of constellation size M and searched layer, the performance improved as the search window was increased in size. Thus, by increasing the search window size (and hence increasing the degree of parallelism), one can obtain SER performance that lies in between the performance of V-BLAST and F-BLAST.

The FR-BLAST plots in Fig. 4 also illustrate how the diversity order in each configuration levels off and approaches the order of V-BLAST as the SNR increases. What appears to cause this effect is that the correct value of the symbol in the searched layer becomes increasingly difficult to predict.
The corresponding window histograms, which record the probability of the F-BLAST estimates with respect to each given MMSE estimate $s_X$ of the symbol in the searched layer, flatten out as the SNR increases. Thus the interference experienced by the strongest layer (i.e., the first layer detected by the parallel V-BLAST subdetectors) is no longer effectively reduced by the initial symbol cancellation step. Interestingly, F-BLAST avoids the problem, even at the highest SNRs, by simply considering all of the possible symbols in the searched layer. As long as the strongest layer continues to be detected near-optimally, the successive interference cancellation strategy of V-BLAST allows the remaining symbols to be recovered almost as accurately.

V. COMPUTATIONAL COMPLEXITY

The implementation cost of detectors involves several different interrelated quantities. The required hardware (e.g., the number of adders, multipliers, bits of intermediate storage) can be traded off against the detection latency (processing delay between the arrival of a new signal vector and the output of the detected bits) and the data throughput (the number of bits detected per unit time). The energy per decoded bit is an especially important figure of merit for detectors that are to be used in battery-powered applications. A detailed discussion of the cost models is beyond the scope of this paper. However, Table 1 has been included to illustrate how the new F-BLAST and FR-BLAST detectors compare with MMSE, V-BLAST and ML. The last four table columns express the computational cost, per decoded symbol vector, in terms of the number of real-valued multiplications, additions, reciprocal operations, and the number of clock cycles assuming one operation per clock cycle and assuming the availability of arbitrarily large hardware parallelism.

ML detectors for M = 16, 64 and 256 are prohibitively expensive in terms of multiplications and additions.
The fully parallel time cost is misleading for this detector since the hardware parallelism would be enormous (proportional to $M \times M \times M \times M$). In the case of M = 64, F-BLAST requires 4.8 times the real multiplications and 2.8 times the real additions compared to MMSE. With FR-BLAST, restricting the window size to W = 16 (25% of the exhaustive M = 64 search of the weakest layer in F-BLAST) reduces the multiplications and additions to 2.1 and 1.2 times, respectively, the operations required by MMSE. As shown in Fig. 4(a), the symbol error rate of this FR-BLAST detector is about two orders of magnitude lower than that of MMSE for signal-to-noise ratios exceeding 28 dB. Further restricting the FR-BLAST window size to W = 8 (12.5% of M = 64) reduces these relative numbers to 1.7 and 0.9, but it also increases the symbol error rate significantly.

VI. CONCLUSIONS

We described a restricted search version of the F-BLAST (i.e., parallel V-BLAST) detector for MIMO systems. The resulting FR-BLAST detectors exploit the parallel structure of F-BLAST to gracefully trade off the near-optimal performance of F-BLAST to reduce the computational cost by reducing the degree of parallelism in the symbol layer that is being searched. In F-BLAST, the best choice for the searched layer is the one with the weakest estimated SNR; however, in the reduced parallelism FR-BLAST detector (and unlike the version in [14]), the best choice for the searched layer is either the layer with the second weakest SNR or the layer that is likely to produce the greatest interference to the strongest layer (and these layers are often different). As with F-BLAST, the performance of FR-BLAST scales up well as the number of antennas increases and as the size of the constellation grows. The convenient parallel structure of F-BLAST remains in FR-BLAST, although some additional complexity is required to construct the search windows.
However, the fixed search windows can be computed off-line in advance and stored in a read-only memory. The parallel structure of FR-BLAST could be exploited in MIMO-OFDM systems, where MIMO decoders are required for a relatively large number of subcarriers.

General expressions for the computational complexity of ML, MMSE and F-BLAST appear in [14]. Consider the detection of 64-QAM symbols in a 4×4 system. The number of real-valued additions, multiplications and reciprocal operations per detected signal vector (ignoring the nulling matrix computation) rises by factors of 13.7, 12.9 and 50 (from 2 to 100), respectively, when changing the hard decision detector from MMSE to FR(W2,16)-BLAST, and the result is a reduction in the SER by a factor of roughly 40 for SNRs exceeding 30 dB. The complexity of FR-BLAST with respect to F-BLAST is determined mostly by the ratio of the number of parallel V-BLAST subdetectors in FR-BLAST to the size of the symbol constellation. The search windows required by FR-BLAST are precomputed and do not add to the run-time computational complexity, but they do require some storage capacity, which can be minimized by exploiting symmetries in the constellation. FR-BLAST computes the same nulling matrix as F-BLAST, and in both cases this cost is incurred each time the receiver updates its estimate of the channel matrix.

The work on FR-BLAST is being extended in several different directions. Selection diversity could be exploited to combine the outputs of parallel FR(W2)-BLAST and FR(MI)-BLAST detectors. The rule for selecting the best layer to be searched (for each updated channel matrix estimate) could be made more sophisticated than simply selecting the second weakest W2 or maximum interference MI layers.
In particular, for the highest SNRs it might be more appropriate to choose the layer that has window histograms where, with high probability, the optimal estimate of the symbol in the searched layer will be well covered by the search window (i.e., the layer where the profiles of the window histograms are a good fit with the available search window size). Finally, we are extending F-BLAST and FR-BLAST to produce soft outputs that can then be decoded with an iterative decoder (e.g., a standard Turbo decoder). In the longer term it will be important to determine how the complexity of the total system (soft detector and soft bit decoder) and the expected energy per decoded bit are affected by the use of parallel V-BLAST detectors.

ACKNOWLEDGMENT

The authors wish to thank Drs. Saeed Fouladi Fard and Amirhossein Alimohammad for access to MATLAB simulation modules and for fruitful discussions concerning the implementation and performance of F-BLAST.

REFERENCES

[1] J. G. Proakis and M. Salehi, Digital Communications, 5th ed., New York, NY: McGraw-Hill, 2008.
[2] M. Sellathurai and S. Haykin, Space-time Layered Information Processing for Wireless Communications, Hoboken, NJ: John Wiley & Sons, 2009.
[3] S. Sesia, I. Toufik, and M. Baker, LTE: The UMTS Long Term Evolution: From Theory to Practice, Hoboken, NJ: John Wiley & Sons, 2009.
[4] G. J. Foschini, "Layered Space-Time Architecture for Wireless Communication in a Fading Environment When Using Multiple Antennas," Bell Laboratories Technical Journal, vol. 1, no. 1, pp. 41-59, Autumn 1996.
[5] M. O. Damen, H. El Gamal, and G. Caire, "On maximum-likelihood detection and the search for the closest lattice point," IEEE Trans. Inf. Theory, vol. 49, no. 10, pp. 2389-2402, Oct. 2003.
[6] B. Hassibi and H. Vikalo, "On the sphere-decoding algorithm II. Generalizations, second-order statistics, and applications to communications," IEEE Trans. Signal Process., vol. 53, no. 8, pp. 2819-2834, Aug. 2005.
[7] Z. Guo and P. Nilsson, "Algorithm and implementation of the K-best sphere decoding for MIMO detection," IEEE J. Sel. Areas Commun., vol. 24, no. 2, pp. 491-503, Mar. 2006.
[8] M. Shabany and P. G. Gulak, "A 0.13um CMOS 655Mb/s 4x4 64-QAM K-Best MIMO detector," 2009 IEEE Int. Solid-State Circuits Conf., pp. 256-257, 257a.
[9] J. W. Choi, B. Shim, A. C. Singer, and N. I. Cho, "Low-complexity decoding via reduced dimension maximum-likelihood search," IEEE Trans. Signal Proc., vol. 58, no. 3, pp. 1780-1793, Mar. 2010.
[10] P. W. Wolniansky, G. J. Foschini, G. D. Golden, and R. A. Valenzuela, "V-BLAST: An architecture for realizing very high data rates over the rich scattering wireless channel," Proc. Int. Symp. Signals, Systems, and Electronics, 1998, pp. 295-300.
[11] G. D. Golden, G. J. Foschini, R. A. Valenzuela, and P. W. Wolniansky, "Detection algorithm and initial laboratory results using V-BLAST space-time communication architecture," Electronics Letters, vol. 35, no. 1, pp. 14-15, Jan. 1999.
[12] Y. Jiang, X. Zheng, and J. Li, "Asymptotic performance analysis of V-BLAST," IEEE GLOBECOM 2005, pp. 3882-3886.
[13] S. Fouladi Fard, A. Alimohammad, and B. F. Cockburn, "Improved layered MIMO detection algorithm with near-optimal performance," IET Electronics Letters, vol. 45, no. 13, pp. 675-677, June 18, 2009.
[14] A. Pankeu Yomi and B. F. Cockburn, "Near-optimal and efficient MIMO detectors for 64-QAM symbols," IEEE Cdn. Conf. Electrical and Computer Eng. (CCECE 2010), May 2-5, Calgary, AB, 6 pp.
Table 1. Symbol Vector Computational Complexity of Alternative 4×4 MIMO Detectors
Columns: 4×4 MIMO Detector | M | Real Multiplications | Real Additions | Real Reciprocals | Fully Parallel Time in Cycles
Note: Numbers in brackets give counts relative to the MMSE detector with the corresponding value of M.