C++ - Removing lines from image
I am a beginner in OpenCV. I need to remove the horizontal and vertical lines in the image so that only the text remains (the lines were causing trouble when extracting text with OCR). I am trying to extract text from a nutrient fact table. Can anyone help me?
This was an interesting question, so I gave it a shot. Below I show how to extract and remove horizontal and vertical lines. You can extrapolate from it. Also, for the sake of saving time, I did not preprocess your image to crop out the background, as one should, which is an avenue for improvement.
The result is produced by the code below (edit: added vertical lines):
#include <iostream>
#include <vector>
#include <opencv2/opencv.hpp>

using namespace std;
using namespace cv;

// Constant names are the OpenCV 3+ C++ ones. On OpenCV 2.x use the CV_-prefixed
// equivalents (CV_BGR2GRAY, CV_RETR_TREE, CV_CHAIN_APPROX_NONE, CV_FILLED).
int main(int argc, char** argv)
{
    if (argc < 2) {
        cerr << "Usage: " << argv[0] << " <image>" << endl;
        return -1;
    }

    // Load the image
    Mat src = imread(argv[1]);

    // Check that the image loaded fine
    if (!src.data) {
        cerr << "Problem loading image!!!" << endl;
        return -1;
    }

    Mat gray;
    if (src.channels() == 3) {
        cvtColor(src, gray, COLOR_BGR2GRAY);
    } else {
        gray = src;
    }

    // Inverse binary image
    Mat bw;
    // This will hold the result, the image to be passed to OCR
    Mat fin;

    // I find Otsu binarization best for text.
    // It would perform better if the background had been cropped out.
    threshold(gray, bw, 0, 255, THRESH_BINARY_INV | THRESH_OTSU);
    threshold(gray, fin, 0, 255, THRESH_BINARY | THRESH_OTSU);
    imshow("binary", bw);

    Mat dst;
    Canny(fin, dst, 50, 200, 3);
    Mat str = getStructuringElement(MORPH_RECT, Size(3, 3));
    dilate(dst, dst, str, Point(-1, -1), 3);
    imshow("dilated_canny", dst);

    // bitwise_and with the Canny image helps with background noise
    bitwise_and(bw, dst, dst);
    imshow("and", dst);

    // Images that will hold the horizontal and vertical lines
    Mat horizontal = dst.clone();
    Mat vertical = dst.clone();
    fin = ~dst;

    // Kernel width: the divisor was selected arbitrarily
    int horizontalsize = horizontal.cols / 30;
    Mat horizontalStructure = getStructuringElement(MORPH_RECT, Size(horizontalsize, 1));
    erode(horizontal, horizontal, horizontalStructure, Point(-1, -1));
    dilate(horizontal, horizontal, horizontalStructure, Point(-1, -1), 1);
    imshow("horizontal_lines", horizontal);

    // Need to find the horizontal contours, so as not to damage letters
    vector<Vec4i> hierarchy;
    vector<vector<Point> > contours;
    findContours(horizontal, contours, hierarchy, RETR_TREE, CHAIN_APPROX_NONE);
    for (const auto& c : contours) {
        Rect r = boundingRect(c);

        float percentage_height = (float)r.height / (float)src.rows;
        float percentage_width = (float)r.width / (float)src.cols;

        // These exclude contours that are probably not dividing lines
        if (percentage_height > 0.05)
            continue;
        if (percentage_width < 0.50)
            continue;

        // Fill in the line with a white rectangle
        rectangle(fin, r, Scalar(255, 255, 255), FILLED);
    }

    int verticalsize = vertical.rows / 30;
    Mat verticalStructure = getStructuringElement(MORPH_RECT, Size(1, verticalsize));
    erode(vertical, vertical, verticalStructure, Point(-1, -1));
    dilate(vertical, vertical, verticalStructure, Point(-1, -1), 1);
    imshow("vertical_lines", vertical);

    findContours(vertical, contours, hierarchy, RETR_TREE, CHAIN_APPROX_NONE);
    for (const auto& c : contours) {
        Rect r = boundingRect(c);

        float percentage_height = (float)r.height / (float)src.rows;
        float percentage_width = (float)r.width / (float)src.cols;

        // These exclude contours that are probably not dividing lines
        if (percentage_width > 0.05)
            continue;
        if (percentage_height < 0.50)
            continue;

        // Fill in the line with a white rectangle
        rectangle(fin, r, Scalar(255, 255, 255), FILLED);
    }

    imshow("result", fin);
    waitKey(0);
    return 0;
}
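To see why the erode followed by dilate with a wide, one-pixel-high structuring element keeps table lines but drops letters, here is a minimal sketch that does not use OpenCV at all. It assumes a tiny binary grid (1 = ink, 0 = background) and ignores border effects. The names `Grid` and `openHorizontal` are made up for this illustration. On a single row, that pair of operations amounts to keeping only runs of ink at least as long as the kernel:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a binary image: 1 = ink, 0 = background.
using Grid = std::vector<std::vector<int>>;

// Keeps only the ink pixels that sit in a horizontal run of at least k pixels.
// This mimics erode + dilate with a k x 1 rectangular kernel, borders ignored.
Grid openHorizontal(const Grid& img, std::size_t k) {
    assert(k > 0);
    Grid out(img.size());
    for (std::size_t y = 0; y < img.size(); ++y) {
        const std::vector<int>& row = img[y];
        out[y].assign(row.size(), 0);
        std::size_t x = 0;
        while (x < row.size()) {
            if (row[x] == 0) { ++x; continue; }
            // Measure the current run of ink pixels.
            std::size_t start = x;
            while (x < row.size() && row[x] != 0) ++x;
            // Long runs (lines) survive, short runs (letter strokes) vanish.
            if (x - start >= k)
                for (std::size_t i = start; i < x; ++i) out[y][i] = 1;
        }
    }
    return out;
}
```

Calling `openHorizontal(img, img[0].size() / 30)` would correspond to the `cols / 30` kernel width chosen above. The vertical case is the same idea applied along columns.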
The limitation of this approach is that the lines need to be straight. Because of the curve in the bottom line, it cuts into the "E" in "Energy". Perhaps with Hough line detection, as was suggested (I've never used it), a similar but more robust approach could be devised. Also, filling in the lines with rectangles is probably not the best approach.
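One possible alternative to the rectangle fill, sketched here without OpenCV, is to whiten only the pixels that the line mask actually flags. The sketch assumes small rectangular grids (255 = white, 0 = ink) and a mask of the same size. `Image`, `fillBoundingBox` and `clearMasked` are hypothetical names for this illustration. In the program above, the equivalent would presumably be a call such as `Mat::setTo` with the line image as mask, but I have not tested that.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a grayscale image: 255 = white, 0 = ink.
using Image = std::vector<std::vector<int>>;

// Rectangle strategy: whiten everything inside the bounding box of the mask.
Image fillBoundingBox(Image img, const Image& mask) {
    assert(img.size() == mask.size());
    bool found = false;
    std::size_t top = 0, bottom = 0, left = 0, right = 0;
    for (std::size_t y = 0; y < mask.size(); ++y) {
        assert(img[y].size() == mask[y].size());
        for (std::size_t x = 0; x < mask[y].size(); ++x) {
            if (mask[y][x] == 0) continue;
            if (!found) {
                top = bottom = y;
                left = right = x;
                found = true;
            } else {
                top = std::min(top, y);
                bottom = std::max(bottom, y);
                left = std::min(left, x);
                right = std::max(right, x);
            }
        }
    }
    if (!found) return img;
    for (std::size_t y = top; y <= bottom; ++y)
        for (std::size_t x = left; x <= right; ++x)
            img[y][x] = 255;
    return img;
}

// Mask strategy: whiten only the pixels that the mask flags as line pixels.
Image clearMasked(Image img, const Image& mask) {
    assert(img.size() == mask.size());
    for (std::size_t y = 0; y < mask.size(); ++y) {
        assert(img[y].size() == mask[y].size());
        for (std::size_t x = 0; x < mask[y].size(); ++x)
            if (mask[y][x] != 0) img[y][x] = 255;
    }
    return img;
}
```

When a line bends or tilts, its bounding box grows and can swallow nearby letter pixels. The mask strategy leaves those pixels alone, at the cost of depending entirely on how good the mask is.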