Complexity Categories

Chinese | English

In this chapter, we'll first clarify the definitions of these terms: NP, NP-hard, NP-complete, co-NP, P vs NP.

Before encountering this knowledge, I always thought these terms were very advanced and profound. Today, let's explore them together.

Review: P problems are the set of problems that can be solved in poly-time complexity.

Verification

Whether a problem belongs to NP is not about whether the problem is difficult to solve, but about verifying it! ** The core of NP is:

Given a candidate solution, can you "verify" its correctness in polynomial time?

This is the key point; we don't care whether it's easy to solve.

Formal Definition of NP (but expressed intuitively)

NP = Nondeterministic Polynomial-time.

Formal definition: A problem is NP if and only if

Every YES instance has a polynomial-length certificate, and this certificate can be verified in poly-time.

Examples of Some NP Problems

Independet Set Problem

Def. In a graph G=(V,E), we say that a set of nodes SE is an independent set if no two nodes in S are joined by an edge.

That is: a subset of nodes in a graph G where no two nodes are joined by an edge.

Therefore, the Independent Set Problem is NP-complete.

SAT Problem and 3-SAT Problem

Definition of a Clause:

Given n boolean variables x1,x2,...,xn, a clause is a disjunction of terms t1t2...tl, where ti{x1,...,xn,x1,...,xn}

Essentially, it's just a bunch of variables that are ORed with 0 or 1.

The SAT Problem is defined as follows:

Given clauses C1,...,Ck, if there is an assignment that can satisfy all the clauses: C1C2,...,Ck.

In essence, it asks whether it's possible to find an assignment that satisfies all clauses.

Clearly, this problem can also be verified in poly-time.

Therefore: NP = the set of all problems that can "verify YES instances" in polynomial time.

Solving may be difficult, but verification must be fast.

Clique Problem

Problem CLIQUE(G,k) asks whether, given a graph G, does G contain a k-clique? A k-clique is defined to be a set of k vertices such that each pair of these vertices shares an edge.

Given a graph, can we find k nodes that are connected in pairs?

Vertex Cover

In a graph, can we find a set of vertices that covers all edges? An edge is covered if one of its two ends belongs to the Vertex Cover.

Hamiltonian Cycle (Decision Version)

Find a cyclic path in graph G, traversing each vertex once.

What is NP-hard?

NP-hard = "At least as hard as the hardest problem in NP".

More formally:

NP-hard problems are those problems to which all NP problems can be poly-time reduced.

The Halting Problem is one of the most fundamental results in theoretical computer science. It asks:

Given a program P and an input x, will P(x) eventually halt, or will it run forever?

Alan Turing proved in 1936 that there is no algorithm that can always answer this question correctly for all possible programs and inputs.

Therefore:

NP-complete NP-hard

NP-hard is a larger set.

The complete and rigorous proof of whether a problem belongs to NP is at the end of this chapter: How to Prove a Problem Belongs to NP

What is NP-complete?

NP-complete = NP NP-hard.

In other words:

A problem X is NP-complete if and only if:

  1. X ∈ NP

  2. All problems in NP can be poly-time reduced to X

These problems are the most difficult "representative problems" in NP.

Common NP-complete problems:

  1. SAT

  2. 3SAT

  3. CLIQUE

  4. Independent Set

  5. Vertex Cover

  6. Hamiltonian Cycle

  7. TSP (decision)

They are mutually reducible, therefore essentially of equal difficulty.

P vs NP: The Most Important Problem in Computer Science

From the above, we know:

Obviously:

PNP

But is:

P = NP?

Currently unknown.

[!important]

Current conclusion: Whether P equals NP is unknown!

However, most computer scientists believe:

P ≠ NP

What would happen if P = NP?

Search/scheduling/TSP problems could be solved instantaneously.

But currently, this problem remains unsolved and is one of the Seven Millennium Problems.

How to prove a problem is NP

Example 1

The Sushi Express wants to implement a new feature, "order batching." The menu consists of a set of disired food items F and set of combos c1,...,cn where each combo is a set of food items along with a price - we represent combo ci=({fi1,fi2,...,fik},pi), where each fij is a food item in the combo. If a student orders ci, they receive each of the food items, and they pay the cost pi. In the new order batching system, the students would input a list of food items that they want, and the order batching system should find the cheapest set of combos, which would include at least one of each desired food item. Write the decision version of this problem, and show via reduction that this decision problem is NP Hard.

This is a homework question for me. Of course, I won’t prove it here. HP-Hard, let's first prove that the Decision Version of the problem is NP.

Define the Decision Version

We define the decision version of the Sushi Batching problem as follows:

Input:

Question:

Does there exist a subset of combos I{1,...,n} such that:

  1. iISiR (Unite to cover all the food you want), and

  2. iIpiB

Showing the Problem is in NP

A certificate can be a subset of combo indices I{1,...,n}. The certifier checks the following in polynomial time:

  1. Compute iIpi and verify it is B

  2. Compute the union iISi and verify that it contains every element of R.

  3. Verify that I have no duplicate indices.

These checks require only scanning sets and summing numbers, all of which take polynomial time. Thus the problem has an efficient certification and is in NP.

Example 2

Problem CLIQUE(G,k) asks whether, given a graph G, does G contain a k-clique? A k-clique is defined to be a set of k vertices such that each pair of these vertices shares an edge. Show via reduction that CLIQUE is NP-complete.

Again, first prove that the problem is NP.

Showing the Problem is NP

Certificate:

A certificate can be a subset of vertices SV with |S|=k.

Certifier:

Given (G,k) and the certificate S, the certifier is:

  1. Checks that |S|=k

  2. For every pair (u,v) in S, checks whether (u,v)E

There are at most k/2 pairs, so this verification takes polynomial time in the size of the input graph. Thus, CLIQUE has an efficient certificate and is NP.

The proof should be straightforward by now. Proving a problem is NP is very simple; just state that the problem can be verified in poly-time.