When working with web scraping or manipulating HTML content in Go, you may need to extract the content inside the <body> tag and convert it into a string. This is particularly useful when you want to process or analyze the body content of web pages. In this blog post, we’ll walk through how to achieve this using Go.
Before we dive into the code, make sure you have Go installed on your machine. If not, you can download it from the official Go website.
We’ll also be using the following packages:
net/http for making HTTP requests.
golang.org/x/net/html for parsing the HTML content.
You can install the html package from golang.org/x/net using the following command:
```bash
go get golang.org/x/net/html
```
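If you are starting in a fresh directory without a go.mod file, you may first need to initialise a module before go get will work (the module path below is just a placeholder):

```bash
go mod init example.com/bodyextract
```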
Step-by-Step Guide
First, we need to fetch the HTML content of the web page. We’ll use the net/http package for this.
```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// fetchHTML downloads the page at url and returns the raw HTML as a string.
func fetchHTML(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	url := "http://example.com"

	htmlContent, err := fetchHTML(url)
	if err != nil {
		fmt.Println("Error fetching HTML:", err)
		return
	}
	fmt.Println(htmlContent)
}
```
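Note that http.Get only returns an error for transport-level failures; a 404 or 500 response still comes back with a nil error. If you also want to treat non-200 responses as failures, one option (the name fetchHTMLChecked is purely illustrative) is a variant like this:

```go
// fetchHTMLChecked works like fetchHTML but also rejects non-200 responses.
func fetchHTMLChecked(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	// http.Get does not treat 4xx/5xx status codes as errors, so check explicitly.
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}
```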
Next, we’ll parse the HTML content and extract the content inside the <body> tag. For this, we’ll use the golang.org/x/net/html package.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"

	"golang.org/x/net/html"
)

// fetchHTML downloads the page at url and returns the raw HTML as a string.
func fetchHTML(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// extractBodyContent parses htmlContent and returns the inner HTML of the <body> tag.
func extractBodyContent(htmlContent string) (string, error) {
	doc, err := html.Parse(bytes.NewReader([]byte(htmlContent)))
	if err != nil {
		return "", err
	}

	var bodyContent string
	var f func(*html.Node)
	f = func(n *html.Node) {
		if n.Type == html.ElementNode && n.Data == "body" {
			// Render each child of <body> back to HTML and concatenate the results.
			for c := n.FirstChild; c != nil; c = c.NextSibling {
				var buf bytes.Buffer
				html.Render(&buf, c)
				bodyContent += buf.String()
			}
		}
		for c := n.FirstChild; c != nil; c = c.NextSibling {
			f(c)
		}
	}
	f(doc)

	return bodyContent, nil
}

func main() {
	url := "http://example.com"

	htmlContent, err := fetchHTML(url)
	if err != nil {
		fmt.Println("Error fetching HTML:", err)
		return
	}

	bodyContent, err := extractBodyContent(htmlContent)
	if err != nil {
		fmt.Println("Error extracting body content:", err)
		return
	}
	fmt.Println(bodyContent)
}
```
Explanation
Fetching HTML Content: We make an HTTP GET request to the specified URL and read the response body.
Parsing HTML: We parse the HTML content using html.Parse.
Extracting Body Content: We traverse the parsed HTML nodes to find the <body> tag. Once found, we extract its inner content by rendering each of its child nodes back to a string with html.Render.
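Because html.Parse builds a complete document tree and always produces a single <body> element, the traversal can stop as soon as that node is found. As a small optional refinement (the helper names findBody and extractBodyContentFast are just illustrative, and the functions reuse the imports already shown above), the extraction could also be written like this:

```go
// findBody walks the parsed tree and returns the first <body> element node, or nil.
func findBody(n *html.Node) *html.Node {
	if n.Type == html.ElementNode && n.Data == "body" {
		return n
	}
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		if body := findBody(c); body != nil {
			return body
		}
	}
	return nil
}

// extractBodyContentFast renders the children of <body> without visiting the rest of the tree.
func extractBodyContentFast(htmlContent string) (string, error) {
	doc, err := html.Parse(bytes.NewReader([]byte(htmlContent)))
	if err != nil {
		return "", err
	}

	body := findBody(doc)
	if body == nil {
		return "", fmt.Errorf("no <body> element found")
	}

	var buf bytes.Buffer
	for c := body.FirstChild; c != nil; c = c.NextSibling {
		if err := html.Render(&buf, c); err != nil {
			return "", err
		}
	}
	return buf.String(), nil
}
```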
Running the Code
To run the code, save it to a file, for example main.go, and execute it using the following command:
```bash
go run main.go
```
Replace http://example.com with the URL of the web page you want to process.
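If you’d rather not edit the source for each URL, one option is to read it from the command line instead; a minimal sketch of an alternative main (it additionally needs the os import) could look like this:

```go
func main() {
	// Expect the target URL as the first command-line argument.
	if len(os.Args) < 2 {
		fmt.Println("usage: go run main.go <url>")
		return
	}
	url := os.Args[1]

	htmlContent, err := fetchHTML(url)
	if err != nil {
		fmt.Println("Error fetching HTML:", err)
		return
	}

	bodyContent, err := extractBodyContent(htmlContent)
	if err != nil {
		fmt.Println("Error extracting body content:", err)
		return
	}
	fmt.Println(bodyContent)
}
```

You would then run it as, for example, go run main.go http://example.com.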
In this blog post, we’ve shown how to fetch HTML content from a web page and extract the content inside the <body> tag as a string using Go. This approach is particularly useful for web scraping and HTML content processing. With Go’s standard library and the golang.org/x/net/html package, handling and manipulating HTML content is straightforward and efficient.