mlapi

Parameters

This page lists the input and output parameters of the Extract Text from PDF API.

Input Parameters

| Parameter | Description |
| --- | --- |
| pdf_url | string. The URL where the PDF is hosted. |
| file | blob. A file to upload for processing. Maximum file size: 50 MB. |
| api_key | string. Your API key. For testing, you can set it to free: {"api_key" : "free"} |
| detect_lang | boolean, default false. Enables language detection. When set to true, the response includes a key for the detected language, for example {"detected_lang" : "en"} |
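As a rough illustration of how these inputs fit together, the sketch below assembles and validates the request fields before they are handed to an HTTP client. The helper name `build_request`, the rule that exactly one of `pdf_url` or `file` is supplied, and the reading of 50 MB as 50 × 1024 × 1024 bytes are assumptions made for this example, not documented behaviour. The endpoint URL and wire encoding are deliberately left out; see the language-specific pages for those.

```python
from typing import Optional

# 50 MB limit from the parameter table; treating MB as 1024 * 1024 bytes is an assumption.
MAX_FILE_BYTES = 50 * 1024 * 1024


def build_request(
    pdf_url: Optional[str] = None,
    file_bytes: Optional[bytes] = None,
    api_key: str = "free",
    detect_lang: bool = False,
) -> dict:
    """Hypothetical helper: collect the documented inputs into one dict."""
    # Assumption: a request carries either pdf_url or file, never both or neither.
    if (pdf_url is None) == (file_bytes is None):
        raise ValueError("pass exactly one of pdf_url or file_bytes")

    fields = {"api_key": api_key, "detect_lang": detect_lang}
    if pdf_url is not None:
        fields["pdf_url"] = pdf_url
    elif len(file_bytes) > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 50 MB limit")

    # 'file' stays separate so the caller can send it as a binary upload.
    return {"fields": fields, "file": file_bytes}
```

Calling `build_request(pdf_url="https://pdfobject.com/pdf/sample.pdf")` yields the fields for a URL-based request with the `free` test key, while `build_request(file_bytes=data, detect_lang=True)` prepares a file upload with language detection switched on.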

Sample inputs

For sample inputs, see the Programming Languages or Web and Mobile frameworks pages.

Output Parameters

| Parameter | Description |
| --- | --- |
| text | string. The text extracted from the PDF. |
| processing_time | string. Total processing time. |
| metadata.source_type | string. One of two values: URL if you sent pdf_url as input, or File if you sent a file for processing. |
| metadata.source_info | string. The PDF URL if you sent pdf_url as input, or Uploaded file if you sent a file. |
| metadata.page_count | number. Number of pages in the PDF. |
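One way to work with these fields is to map the JSON response onto a small typed structure. The following is a minimal sketch: the `ExtractionResult` class and `parse_response` function are hypothetical names, and reading `detected_lang` from the top level of the response is an assumption, since this page only says the key appears "in the response".

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractionResult:
    text: str
    processing_time: str
    source_type: str   # "URL" or "File"
    source_info: str   # the PDF URL, or "Uploaded file"
    page_count: int
    detected_lang: Optional[str] = None  # only expected when detect_lang was true


def parse_response(body: str) -> ExtractionResult:
    """Flatten the documented response shape into an ExtractionResult."""
    payload = json.loads(body)
    meta = payload["metadata"]
    return ExtractionResult(
        text=payload["text"],
        processing_time=payload["processing_time"],
        source_type=meta["source_type"],
        source_info=meta["source_info"],
        page_count=meta["page_count"],
        # Assumption: detected_lang is a top-level key when present.
        detected_lang=payload.get("detected_lang"),
    )
```

With this in place, `parse_response(body).page_count` gives the page count as an integer, and `detected_lang` is simply `None` when language detection was not requested.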

Sample Output

  1. With URL

{
  "metadata": {
    "page_count": 1,
    "source_info": "https://pdfobject.com/pdf/sample.pdf",
    "source_type": "URL"
  },
  "processing_time": "0.36 seconds",
  "text": "This is a sample output from https://pdfobject.com/pdf/sample.pdf pdf."
}

  2. With PDF file

{
  "metadata": {
    "page_count": 3,
    "source_info": "Uploaded file",
    "source_type": "File"
  },
  "processing_time": "1.88 seconds",
  "text": "This is a sample output of uploaded pdf file."
}
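Because `processing_time` is returned as a string, a client that wants to log or compare timings has to convert it. The sketch below assumes the value always has the form "<number> seconds", which is inferred only from the two samples above and is not a documented guarantee. The function name `processing_seconds` is hypothetical.

```python
def processing_seconds(value: str) -> float:
    """Convert a processing_time string such as '0.36 seconds' to a float."""
    number, _, unit = value.strip().partition(" ")
    # Assumption: the unit is always the literal word 'seconds'.
    if unit != "seconds":
        raise ValueError(f"unexpected processing_time format: {value!r}")
    return float(number)
```

Applied to the samples above, `processing_seconds("0.36 seconds")` returns `0.36` and `processing_seconds("1.88 seconds")` returns `1.88`. Any other format raises `ValueError` rather than guessing.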

Last updated 9 months ago