Posted by: atri | September 12, 2007

## Project 4: Codeword Testing

Given an $(n,k,d)_q$ code $C$, a codeword tester for $C$ is a randomized algorithm (call it $T$) with the following properties (below $\mathbf{y}$ is the received word and $\epsilon >0$ and $\ell\ge 1$ are some fixed real and integer respectively):

1. If $\mathbf{y}\in C$ then $T$ accepts $\mathbf{y}$ with probability $1$.
2. If $\Delta(\mathbf{y},C)>\epsilon n$ then $T$ rejects with probability at least $\frac{2}{3}$.
3. $T$ queries at most $\ell$ positions in $\mathbf{y}$.

Let us try to get a sense of the parameters. If $\ell=n$, then of course can get a tester $T$— the problem just reduces to error detection. However, the catch is that the number of queries $\ell$ needs to be at least sub-linear and ideally, a constant independent of $n$. It is not too hard to convince yourself that in such a case one cannot have a deterministic algorithm (which is why in the definition $T$ is allowed to be a randomized algorithm). Further, one cannot hope to be able to decide if for every $\mathbf{y}$ whether $\mathbf{y}\in C$. Thus, in property 2. above, we only need to reject received words that have a large Hamming distance for every codeword (for received words in the “middle” $T$ is allowed to have arbitrary behavior). Codes with sub-linear query complexity tester are called Locally Testable Codes (or LTCs).

Let us now pause for a bit and think why we would like to study such a notion. Here are two reasons: the first one is “practical” and is motivated by coding theory considerations while the second is not practical at all and is motivated by complexity theory (though the motivation is pretty darn strong):

1. Such testers as above can be used as a filter. They can serve as light weight procedures (recall that the ideal tester only probes constant many places and hence, the running time will be constant) that can detect received words that have more errors than (some) decoding algorithm for $C$ can tolerate. The advantage is that we save on running our “expensive” decoding algorithms on received words that it cannot recover from.
2. Perhaps the most striking use of codeword testing is in the proof of the PCP theorem. (The theorem was first proven in two papers by Arora, Lund, Motwani, Sudan, Szegedy and Arora, Safra.) Without going into the details, the PCP theorem has been one of the most influential results in complexity theory and has lead to, among other things, a much better understating of the approximability of NP-complete problems. Unfortunately, describing this connection takes a major part of a semester long course (and is out of scope of this puny blog post).

Now the most fundamental question regarding codeword testing is the following:

• Does there exist a family of codes with constant rate and constant relative distance that can be tested with constant number of queries?

For ease of reference I’ll refer to codes that answer the question above in the affirmative as ideal LTCs. Remarkably, it is not known if ideal LTCs exits or not.
Your task in this project will be to study the literature related to tackling the above question (either from the perspective of proving upper bounds or lower bounds). Here are some papers that should help you start out.

• Upper bounds
• The paper by Kaufman and Ron titled Testing Polynomials over General Fields (FOCS04) looks at designing testers for a particular family of codes called Reed-Muller codes (which we will see later in the course). Codeword testers for Reed-Muller codes were crucial in the early proofs of the PCP theorem (the recent proof of the PCP theorem by Dinur does not use Reed-Muller codes. Dinur’s paper also holds the record for the LTC with the best known rate for constant query complexity). The paper by Kaufman and Ron essentially resolves the question for getting testers for Reed-Muller codes with optimal query complexity.
• Recently, there has been an interesting interplay between PCPs and LTCs: LTCs have been constructed using (variants of) PCPs. A good place to start studying these connections is Goldreich‘s survey.
• Kaufman and Litsyn in their paper titled Almost Orthogonal Linear Codes are Locally Testable (FOCS05) looks at the codeword testing problem for certain classes of codes. The proof techniques are different from the ones used in the sequence of work leading up to the paper of Kaufman and Ron.
• You can also look at this chapter in my thesis (based on joint work with Guruswami) that looks at a slightly different version of codeword testing (motivated by the “practical” angle mentioned a few paragraphs back). It also has a survey of some of the results in the usual codeword testing setting.
• Lower bounds (Unfortunately not much is known for lower bounds but here are some results)

In short [:-)], codeword testing is a very interesting area where lots of research still needs to be done (especially in the obtaining lower bounds).