
Optical Character Recognition (OCR)


For the 3rd year SEP project we had to create a management dashboard. The requirement was that the dashboard should give a high-level overview of the existing system, so I wanted to read data from the existing reports and show it in the dashboard. I achieved this using OCR technology.

Requirements

This solution requires Microsoft Office 2003 or later, but earlier than Office 2010: Office 2010 does not support this method, so you will need one of the previous Office versions installed. Office includes a library called Microsoft Office Document Imaging (MODI). The DLL that we need to reference from our code is MDIVWCTL.DLL.
This library also needs to be installed on every client machine that uses the Office document imaging facilities for OCR.
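
One way to check whether MODI is available on a machine before attempting OCR is to look up its COM registration. The following is a minimal sketch, not part of the original project; it assumes the library registers the ProgID "MODI.Document", and the class name ModiCheck is hypothetical.

```csharp
using System;

static class ModiCheck
{
    // Returns true when a COM class with the given ProgID is registered
    // on this machine (Windows only). Type.GetTypeFromProgID returns null
    // when the ProgID cannot be found.
    public static bool IsComClassRegistered(string progId)
    {
        return Type.GetTypeFromProgID(progId) != null;
    }

    // Assumption: MODI registers itself under the ProgID "MODI.Document".
    public static bool IsModiInstalled()
    {
        return IsComClassRegistered("MODI.Document");
    }
}
```

Calling ModiCheck.IsModiInstalled() at startup lets the dashboard show a clear message instead of failing later inside the OCR call.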

Assumptions

Here, we'll assume that you already have an image to work with. We need to define the area of the image whose contents we want to extract and show in the dashboard, so let's describe it with the values X1, Y1, X2, Y2. Let (X1, Y1) be the top-left corner of the area and (X2, Y2) be the bottom-right corner of the area.
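
To make the corner convention concrete, here is a small sketch of how the two corners map to a crop rectangle. The helper name RegionDemo.FromCorners is hypothetical and only for illustration; Rectangle.FromLTRB is the standard System.Drawing call.

```csharp
using System;
using System.Drawing;

static class RegionDemo
{
    // Converts a top-left corner (x1, y1) and a bottom-right corner (x2, y2)
    // into the rectangle used for cropping.
    public static Rectangle FromCorners(int x1, int y1, int x2, int y2)
    {
        if (x2 <= x1 || y2 <= y1)
        {
            throw new ArgumentException("(x2, y2) must lie below and to the right of (x1, y1).");
        }
        return Rectangle.FromLTRB(x1, y1, x2, y2);
    }
}
```

For the corners (816, 584) and (1161, 612), the resulting rectangle is 345 pixels wide and 28 pixels high.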

How the Solution Works

  // Namespaces used by this snippet (declare them at the top of the code-behind file):
  // using System; using System.Drawing; using System.Drawing.Imaging; using System.IO;
  // using System.Linq; using System.Runtime.ExceptionServices; using System.Windows;
  // using System.Windows.Controls; using System.Xml.Linq;
  // The project also needs a reference to the MODI library described under Requirements.

  private void btn_View(object sender, RoutedEventArgs e)
    {
        string result = string.Empty;

        // Load the report image and cut out the region that holds the values.
        using (Bitmap b = new Bitmap(@"..\..\Priyankara\testBMP.jpeg"))
        using (Bitmap newBmp = Extract(b, 816, 584, 1161, 612))
        {
            if (newBmp != null)
            {
                // Save the region as a temporary TIFF file and run OCR on it.
                string tempFile = CreateTempFile(newBmp);
                result = OcrTempFile(tempFile);
            }
        }

        // Write the OCR text to a file, overwriting any earlier result
        // so that stale values are never shown.
        using (TextWriter tsw = new StreamWriter(@"..\..\Priyankara\xmlWriter.txt"))
        {
            tsw.WriteLine(result);
        }

        // Convert the text file to XML: one <Values> element per line of text.
        string[] data = File.ReadAllLines(@"..\..\Priyankara\xmlWriter.txt");
        XElement root = new XElement("orders",
                                     from item in data
                                     select new XElement("Values", item));
        root.Save(@"..\..\Priyankara\xmlWriter.xml");

        // Load the XML file back and read the values.
        XDocument xmlDoc = XDocument.Load(@"..\..\Priyankara\xmlWriter.xml");
        string[] names = xmlDoc.Descendants("orders").Elements("Values").Select(r => r.Value).ToArray();

        // Show the first three values; OCR may return fewer lines than expected.
        txtUsername.Text = names.Length > 0 ? names[0] : string.Empty;
        txtUsername1.Text = names.Length > 1 ? names[1] : string.Empty;
        txtUsername2.Text = names.Length > 2 ? names[2] : string.Empty;
    }

        //Extracting part of the image
        Bitmap Extract(Bitmap bmp, int x1, int y1, int x2, int y2)
        {
            try
            {
                int width = x2 - x1;
                int height = y2 - y1;
                if (bmp == null || width < 1 || height < 1)
                {
                    return null;
                }
                Bitmap subImage = bmp.Clone(new System.Drawing.Rectangle(x1, y1, width, height), bmp.PixelFormat);
                return subImage;
            }
            catch
            {
                return null;
            }
        }

        // save the newly created image into a temp file.
        private string CreateTempFile(Bitmap img)
        {
            String fId = Guid.NewGuid().ToString("N");
            String path = string.Format("{0}{1}.tiff", System.IO.Path.GetTempPath(), fId);
            img.Save(path, ImageFormat.Tiff);
          //  Globals.TempFiles.Add(path);
            return path;
        }

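     // Read the text from the image using MODI.
     // HandleProcessCorruptedStateExceptions lives in System.Runtime.ExceptionServices.
        [HandleProcessCorruptedStateExceptions]
        string OcrTempFile(string path)
        {
            MODI.Document doc = null;
            try
            {
                doc = new MODI.Document();
                doc.Create(path);
                doc.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, false, false);
                MODI.Image img = (MODI.Image)doc.Images[0];
                MODI.Layout layout = img.Layout;
                return (layout.Text ?? string.Empty).Trim();
            }
            catch
            {
                // Any failure while creating the document or running OCR yields an empty result.
                return string.Empty;
            }
            finally
            {
                // Close the document even when OCR throws, and ignore errors from Close itself.
                if (doc != null)
                {
                    try { doc.Close(); } catch { }
                }
            }
        }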

        private void textBox1_TextChanged(object sender, TextChangedEventArgs e)
        {

        }
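
The button handler above goes through a text file and an XML file on disk before filling the text boxes. As an alternative, the same "orders" / "Values" structure can be built in memory directly from the OCR string. This is a sketch of a different technique, not the method used in the project; the class name OcrXml and the method ToXml are hypothetical, and unlike the original it drops empty lines.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class OcrXml
{
    // Builds <orders><Values>...</Values>...</orders> from OCR text,
    // one <Values> element per non-empty line.
    public static XElement ToXml(string ocrText)
    {
        string[] lines = (ocrText ?? string.Empty).Split(
            new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);

        return new XElement("orders",
            lines.Select(line => new XElement("Values", line.Trim())));
    }
}
```

The values can then be read with OcrXml.ToXml(result).Elements("Values").Select(v => v.Value).ToArray(), and the element can still be written to disk with Save if the XML file is needed elsewhere.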


1 comment:

  1. I'm not a developer; I always use a free online OCR tool to recognize and scan text from images.
